docs: web scrapping with XPath (#5494)

* added docs

* add correct link

* typo

* A bit of typography

---------

Co-authored-by: Alexandre Alapetite <alexandre@alapetite.fr>
maTh, 2 years ago
parent
commit 2f48509678
2 files changed with 26 additions and 0 deletions
  1. 2 0
      docs/en/users/02_First_steps.md
  2. 24 0
      docs/en/users/11_website_scraping.md

+ 2 - 0
docs/en/users/02_First_steps.md

@@ -25,3 +25,5 @@ Now that you’ve mastered basic use, it’s time to configure FreshRSS to impro
 * [Access your feeds on a mobile device](06_Mobile_access.md)
 * [Add some extensions](https://github.com/FreshRSS/Extensions)
 * [Frequently asked questions](07_Frequently_Asked_Questions.md)
+
+FreshRSS has a built-in engine that [scrapes a website to generate a feed](11_website_scraping.md).

+ 24 - 0
docs/en/users/11_website_scraping.md

@@ -0,0 +1,24 @@
+# Website scraping
+
+FreshRSS has a built-in [web scraping](https://en.wikipedia.org/wiki/Web_scraping) engine that generates a feed from websites that do not publish an RSS/Atom feed.
+
+## How to add
+
+Go to “Subscription Management” where a new feed can be added.
+Change the “Type of feed source” to “HTML + XPath (Web scraping)”.
+An additional list of text boxes then appears for configuring the web scraping.
+[XPath 1.0](https://www.w3.org/TR/xpath-10/) is used as the traversal language.
+
+### Get the XPath path
+
+Firefox: the built-in “inspect” tool may be used to help create a valid XPath expression.
+Select the node in the HTML, right-click it and choose “Copy”, then “XPath”.
+The XPath expression is now stored in your clipboard.
+
+## Tips & tricks
+
+- [Timezone of date](https://github.com/FreshRSS/FreshRSS/discussions/5483)
+
+## Recommended external manuals
+
+- [XPath Scraping with FreshRSS, by Dan Q](https://danq.me/2022/09/27/freshrss-xpath/) (September 2022)
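
The documentation added in this commit describes configuring XPath expressions but does not show one in action. The following is a minimal sketch, not FreshRSS's own implementation, of how an "item" expression and relative "title"/"link" expressions could combine to turn a page into feed entries. The page fragment, the expression constants, and the `extract_items` helper are all hypothetical. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath 1.0 and requires well-formed markup, so it illustrates the idea rather than reproducing FreshRSS behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment (assumption: real pages usually
# need a tolerant HTML parser before XPath can be applied).
PAGE = """
<html><body>
  <div class="news">
    <article><h2><a href="/post/1">First post</a></h2></article>
    <article><h2><a href="/post/2">Second post</a></h2></article>
  </div>
</body></html>
"""

# Hypothetical expressions: one selects each news item, the other is
# evaluated relative to an item to locate its title and link.
ITEM_XPATH = ".//div[@class='news']/article"
ANCHOR_XPATH = "./h2/a"


def extract_items(page):
    """Return a list of {'title', 'link'} dicts, one per matched item."""
    root = ET.fromstring(page.strip())
    items = []
    for node in root.findall(ITEM_XPATH):
        anchor = node.find(ANCHOR_XPATH)
        if anchor is None:
            continue  # skip items that lack the expected structure
        items.append({"title": anchor.text, "link": anchor.get("href")})
    return items


items = extract_items(PAGE)
for item in items:
    print(item["title"], item["link"])
```

An expression copied from the browser's inspector is typically an absolute path to a single node, so it may need to be generalized by hand (for example, by dropping positional indexes) before it matches every item on the page.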