79 commits

Author SHA1 Message Date
Simon Legner
eac7f96cdb /about: Obtain credits from docs metadata 2022-12-26 19:01:43 +01:00
Simon Legner
965be77c03 Image scraping: evaluate Content-Length header 2022-09-06 14:48:51 +02:00
Simon Legner
200e39ce90 outdated_state: omit 0. and 1. versions 2022-02-21 19:46:04 +01:00
Simon Legner
cb6865b0a8 thor updates:check group major/minor updates 2021-01-03 10:12:14 +01:00
Simon Legner
fbfa9de39c parse_cf_email.rb: fix URI.unescape is obsolete 2020-11-24 23:02:25 +01:00
Simon Legner
c10516cc62 doc.is_outdated: split on [-.], add unit tests 2020-11-19 22:05:50 +01:00
Jasper van Merle
b31dc9d0c2 Fix tests 2019-07-17 22:44:48 +02:00
Thibaut Courouble
bf003669ba Simplify file scraper setup; scrape files in the "docs/[slug]" directory 2018-11-25 16:57:37 -05:00
Thibaut Courouble
68b80bce36 Generate pretty JSON in docs.json manifest
To make it easier to track changes in Git.
2018-03-25 16:27:53 -04:00
Thibaut Courouble
0725a69af5 Store docs' metadata in meta.json files
To avoid relying on the filesystem for modified times.
2018-03-25 14:52:16 -04:00
Thibaut Courouble
dd8c80060a Fix :follow_links option not doing anything when set to false 2017-07-30 13:07:29 -04:00
Thibaut Courouble
6fc48db8af Improve error logging 2017-07-16 15:41:25 -04:00
Thibaut Courouble
ed5a5cadd9 Fix handling of invalid iframe URLs
Fixes #590.
2017-04-22 10:17:51 -04:00
Thibaut Courouble
c1ebb7a0b9 Improve Doc#name and Doc#slug 2017-03-04 10:58:05 -05:00
Andreas Stenius
b36f3f8095 core/doc: make sure name is usable as slug. 2017-03-04 10:50:10 -05:00
Thibaut Courouble
94470251fe Bump Ruby 2.4.0 2017-01-22 14:40:33 -05:00
Thibaut Courouble
4e41ed9f25 Add <base> support 2017-01-22 10:26:14 -05:00
Thibaut Courouble
6f0214eaf3 Make Docs::Parser return the entire document instead of <body> 2017-01-22 10:22:07 -05:00
Thibaut Courouble
0c8ca4e5fa Add SQLite documentation 2016-12-04 11:26:23 -05:00
Thibaut Courouble
721adf8e21 Don't rewrite data URIs 2016-10-10 11:09:17 -04:00
Thibaut Courouble
82d0725747 Improve ordering of entries and types 2016-09-04 10:46:54 -04:00
Thibaut Courouble
5bb96f804a Require all entries to have a name, path and type 2016-06-05 17:04:34 -04:00
nucular
034ecfae72 Replace File.basename in URL#relative_path_to because it doesn't handle special characters in URLs well 2016-05-29 11:04:59 -04:00
Thibaut Courouble
9e1b9ca2a9 Improve MDN/JavaScript scraper 2016-05-01 11:47:40 -04:00
Thibaut Courouble
70b19c238a Sort types/categories by number when they start with a number 2016-04-10 14:09:12 -04:00
Thibaut Courouble
d366e14ea7 Fix Docs::Parse#document? when document has no doctype 2016-04-10 10:16:24 -04:00
Thibaut Courouble
6c9fc464c2 Add :fix_urls_before_parse option for Angular doc 2016-03-26 17:11:19 -04:00
Thibaut Courouble
63c77322d3 Handle unencoded spaces in link hrefs 2016-01-30 13:51:06 -05:00
Thibaut Courouble
c3b9502657 Set version attributes before evaluating block
Ref #25.
2016-01-24 16:13:34 -05:00
Thibaut Courouble
3df9cfff98 Add support for blank and non-number version names
Ref #25.
2016-01-24 13:03:04 -05:00
Thibaut Courouble
16ddcb100c Simplify version path separator
Ref #25.
2016-01-24 13:03:03 -05:00
Thibaut Courouble
b67a02ed35 Add version to doc manifest
Ref #25.
2016-01-24 13:03:03 -05:00
Thibaut Courouble
b2d2066d96 Multi-version support
Ref #25.
2016-01-23 13:50:52 -05:00
Thibaut Courouble
bd6e27eca2 Optionally include 'release' and 'links' in docs manifest 2016-01-17 11:52:53 -05:00
Thibaut Courouble
a639aedcd9 Remove index_path and db_path from docs manifest 2016-01-17 09:32:52 -05:00
Thibaut Courouble
e1c0218230 Rename version -> release 2016-01-16 11:15:53 -05:00
Thibaut
3eb5ccb7ea Raise error and stop scraping on 4xx/5xx status code 2015-12-13 15:39:00 -05:00
Thibaut
6939865137 Finish Dojo scraper 2015-11-22 11:59:43 -05:00
ShaneQful
3465933543 Added dojo to devdocs & ability to define headers in scraper requests 2015-11-22 10:28:53 -05:00
Thibaut
7de19cf800 Make EntryIndex a unique index (don't add the same entry twice) 2015-04-26 18:25:11 -04:00
Thibaut
018628ea7d Add two-pass redirection rewriter
... to avoid having to maintain huge lists of redirects. This works by doing a first pass to detect which internal URL is redirected where, before doing a second (normal) pass that rewrites all these URLs (links) with their final destination. There's a bit of monkey-patching I'm not proud of, but this works(tm).
2015-04-05 17:46:07 -04:00
Thibaut
b29d6ca002 Move doc links to manifest 2015-03-22 16:00:42 -04:00
Thibaut
cf7f446738 Change home_url to a list of links 2015-03-14 16:51:55 -04:00
Thu Trang Pham
642c1cff7d Make sure that home_url can be nil 2015-03-03 11:34:29 -05:00
Thibaut
a59ef1cdb6 Add db_size attribute in doc manifest 2015-01-02 15:29:13 -05:00
Thibaut
456c4cb811 Add Store#size 2015-01-02 15:22:14 -05:00
Thibaut
bc5488faa2 Make docs mtime the greatest of the index and db files' mtime 2014-12-31 14:12:39 -05:00
Thibaut
ca7ff6086e Exclude docs without a db file from the manifest 2014-12-31 14:11:30 -05:00
Thibaut
ca61a2b746 Add Doc#db_path 2014-12-31 14:00:20 -05:00
Thibaut
5c46eabc67 Output a JSON file containing all the pages' content 2014-12-31 13:54:29 -05:00
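
The message of commit 018628ea7d above describes a two-pass redirection rewriter: a first pass records where each internal URL redirects, and a second pass rewrites links to their final destination. The following is only a minimal sketch of that idea, not the project's actual implementation. The names resolve_redirects and rewrite_links, the in-memory redirect hash, and the regex-based href matching are all assumptions made for illustration.

```ruby
# Pass 1 (hypothetical helper): given a map of URL => redirect target,
# resolve every URL to its final destination by following redirect chains.
def resolve_redirects(redirects)
  redirects.keys.each_with_object({}) do |url, final|
    seen = []
    target = url
    # Follow the chain until reaching a URL that is not redirected,
    # stopping early if a redirect loop is detected.
    while redirects.key?(target) && !seen.include?(target)
      seen << target
      target = redirects[target]
    end
    final[url] = target
  end
end

# Pass 2 (hypothetical helper): rewrite each href that points at a
# redirected URL so it points at the final destination instead.
# A regex is used here for brevity; it only handles double-quoted hrefs.
def rewrite_links(html, final_destinations)
  html.gsub(/href="([^"]*)"/) do
    url = Regexp.last_match(1)
    %(href="#{final_destinations.fetch(url, url)}")
  end
end

# Illustrative data only: '/old' redirects to '/older', which redirects to '/new'.
redirects = { '/old' => '/older', '/older' => '/new' }
final = resolve_redirects(redirects)
page = '<a href="/old">Old</a> <a href="/other">Other</a>'
puts rewrite_links(page, final)
```

Splitting the work this way means the redirect map is discovered from the scraped responses rather than maintained by hand, which is the motivation the commit message gives.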