xidel (tool to extract data from HTML/XML/JSON files or pages)

Xidel is a command-line tool to query data from HTML/XML web pages,
JSON-APIs and local files. It implements interpreters for XPath 2,
XPath 3, XQuery 1, XQuery 3, JSONiq, CSS selectors and custom pattern
matching.

XPath and CSS selectors are the most efficient way to select certain
elements from XML/HTML documents. JSONiq (with custom extensions)
is an easy way to select data from JSON. XQuery is a Turing-complete
superset of XPath and allows arbitrary data transformations and the
creation of new documents.
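
As a rough illustration of what selecting elements with an XPath
expression means, here is a minimal sketch that does not use Xidel at
all. It assumes Python's standard xml.etree.ElementTree module, which
supports only a small XPath subset (not XPath 2 or 3), and the sample
document is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for illustration.
DOC = """<html><body>
<a href="one.html">First</a>
<p>text <a href="two.html">Second</a></p>
<a>no target</a>
</body></html>"""

root = ET.fromstring(DOC)

# ".//a[@href]" selects every <a> element that carries an href attribute,
# at any depth below the root, in document order.
links = [(a.get("href"), a.text) for a in root.findall(".//a[@href]")]
print(links)
```

The third link is skipped because it has no href attribute; the
predicate in square brackets does the filtering.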

Pattern matching is for XML/HTML documents what regular expressions
are for plain text, i.e. pattern matching behaves like a regular
expression over the space of tags instead of over the space of
characters.
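
The idea of matching over tags can be sketched with a toy matcher.
This is not Xidel's pattern syntax or its algorithm: the {name}
placeholder notation, the match() helper and the sample table are all
assumptions invented for this example, and the code relies only on
Python's standard library.

```python
import xml.etree.ElementTree as ET

def match(pattern, node, out):
    """Match a pattern element against node; store {name} captures in out.

    Toy semantics: tags must be equal, and each child of the pattern
    must match some child of node, in the same order. Captures from
    partially matched candidates are not rolled back.
    """
    if pattern.tag != node.tag:
        return False
    text = (pattern.text or "").strip()
    if text.startswith("{") and text.endswith("}"):
        out[text[1:-1]] = (node.text or "").strip()
    # A shared iterator keeps the document children in order: once a
    # child is consumed it cannot match a later pattern child.
    children = iter(node)
    return all(any(match(p, c, out) for c in children) for p in pattern)

# Hypothetical pattern and document.
PATTERN = ET.fromstring("<tr><td>{name}</td><td>{price}</td></tr>")
DOC = ET.fromstring(
    "<table>"
    "<tr><td>Apple</td><td>1.20</td></tr>"
    "<tr><td>Pear</td><td>0.90</td></tr>"
    "</table>"
)

rows = []
for tr in DOC.iter("tr"):
    captured = {}
    if match(PATTERN, tr, captured):
        rows.append(captured)
print(rows)
```

The pattern is itself a small document: its structure describes which
tags must appear, and the placeholders mark the text to extract.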

Xidel implements a kind of internal pipe that passes HTTP requests
from one query to the next, so there is no need to distinguish between
selecting links and downloading the data they reference. Arbitrarily
complex queries spanning arbitrarily many pages can therefore be
executed with a single call of Xidel.
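
The select-then-follow idea can be sketched as follows. This is only
a conceptual model, not how Xidel is implemented: the PAGES dictionary
stands in for real HTTP requests, and fetch() and follow_and_extract()
are hypothetical helper names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "site": URL -> page source. It stands in for
# HTTP requests so that the example runs without network access.
PAGES = {
    "index.html": (
        "<html><head><title>Index</title></head><body>"
        '<a href="a.html">A</a><a href="b.html">B</a>'
        "</body></html>"
    ),
    "a.html": "<html><head><title>Page A</title></head><body/></html>",
    "b.html": "<html><head><title>Page B</title></head><body/></html>",
}

def fetch(url):
    """Return the parsed page for url (a real tool would download it)."""
    return ET.fromstring(PAGES[url])

def follow_and_extract(start, follow_path, extract_path):
    """Select links on the start page, fetch each target, extract from it."""
    results = []
    for link in fetch(start).findall(follow_path):
        page = fetch(link.get("href"))
        results.extend(el.text for el in page.findall(extract_path))
    return results

titles = follow_and_extract("index.html", ".//a[@href]", ".//title")
print(titles)
```

The output of the first query (the links) becomes the input of the
second (the pages to read), which is the sense in which one query is
piped into the next.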

Xidel is a powerful and complex tool, with a steep learning
curve. For examples, see the man page xidel(1), and also
/usr/doc/xidel-$VERSION/examples/. The full documentation is available
via "xidel --usage | less".