Simon Legner
eac7f96cdb
/about: Obtain credits from docs metadata
2022-12-26 19:01:43 +01:00
Simon Legner
965be77c03
Image scraping: evaluate Content-Length header
2022-09-06 14:48:51 +02:00
Simon Legner
200e39ce90
outdated_state: omit 0. and 1. versions
2022-02-21 19:46:04 +01:00
Simon Legner
cb6865b0a8
thor updates:check group major/minor updates
2021-01-03 10:12:14 +01:00
Simon Legner
fbfa9de39c
parse_cf_email.rb: fix URI.unescape is obsolete
2020-11-24 23:02:25 +01:00
Simon Legner
c10516cc62
doc.is_outdated: split on [-.], add unit tests
2020-11-19 22:05:50 +01:00
Jasper van Merle
b31dc9d0c2
Fix tests
2019-07-17 22:44:48 +02:00
Thibaut Courouble
bf003669ba
Simplify file scraper setup; scrape files in the "docs/[slug]" directory
2018-11-25 16:57:37 -05:00
Thibaut Courouble
68b80bce36
Generate pretty JSON in docs.json manifest
To make it easier to track changes in Git.
2018-03-25 16:27:53 -04:00
Thibaut Courouble
0725a69af5
Store docs' metadata in meta.json files
To avoid relying on the filesystem for modified times.
2018-03-25 14:52:16 -04:00
Thibaut Courouble
dd8c80060a
Fix :follow_links option not doing anything when set to false
2017-07-30 13:07:29 -04:00
Thibaut Courouble
6fc48db8af
Improve error logging
2017-07-16 15:41:25 -04:00
Thibaut Courouble
ed5a5cadd9
Fix handling of invalid iframe URLs
Fixes #590.
2017-04-22 10:17:51 -04:00
Thibaut Courouble
c1ebb7a0b9
Improve Doc#name and Doc#slug
2017-03-04 10:58:05 -05:00
Andreas Stenius
b36f3f8095
core/doc: make sure name is usable as slug.
2017-03-04 10:50:10 -05:00
Thibaut Courouble
94470251fe
Bump Ruby 2.4.0
2017-01-22 14:40:33 -05:00
Thibaut Courouble
4e41ed9f25
Add <base> support
2017-01-22 10:26:14 -05:00
Thibaut Courouble
6f0214eaf3
Make Docs::Parser return the entire document instead of <body>
2017-01-22 10:22:07 -05:00
Thibaut Courouble
0c8ca4e5fa
Add SQLite documentation
2016-12-04 11:26:23 -05:00
Thibaut Courouble
721adf8e21
Don't rewrite data URIs
2016-10-10 11:09:17 -04:00
Thibaut Courouble
82d0725747
Improve ordering of entries and types
2016-09-04 10:46:54 -04:00
Thibaut Courouble
5bb96f804a
Require all entries to have a name, path and type
2016-06-05 17:04:34 -04:00
nucular
034ecfae72
Replace File.basename in URL#relative_path_to because it doesn't handle special characters in URLs well
2016-05-29 11:04:59 -04:00
Thibaut Courouble
9e1b9ca2a9
Improve MDN/JavaScript scraper
2016-05-01 11:47:40 -04:00
Thibaut Courouble
70b19c238a
Sort types/categories by number when they start with a number
2016-04-10 14:09:12 -04:00
Thibaut Courouble
d366e14ea7
Fix Docs::Parse#document? when document has no doctype
2016-04-10 10:16:24 -04:00
Thibaut Courouble
6c9fc464c2
Add :fix_urls_before_parse option for Angular doc
2016-03-26 17:11:19 -04:00
Thibaut Courouble
63c77322d3
Handle unencoded spaces in link hrefs
2016-01-30 13:51:06 -05:00
Thibaut Courouble
c3b9502657
Set version attributes before evaluating block
Ref #25.
2016-01-24 16:13:34 -05:00
Thibaut Courouble
3df9cfff98
Add support for blank and non-number version names
Ref #25.
2016-01-24 13:03:04 -05:00
Thibaut Courouble
16ddcb100c
Simplify version path separator
Ref #25.
2016-01-24 13:03:03 -05:00
Thibaut Courouble
b67a02ed35
Add version to doc manifest
Ref #25.
2016-01-24 13:03:03 -05:00
Thibaut Courouble
b2d2066d96
Multi-version support
Ref #25.
2016-01-23 13:50:52 -05:00
Thibaut Courouble
bd6e27eca2
Optionally include 'release' and 'links' in docs manifest
2016-01-17 11:52:53 -05:00
Thibaut Courouble
a639aedcd9
Remove index_path and db_path from docs manifest
2016-01-17 09:32:52 -05:00
Thibaut Courouble
e1c0218230
Rename version -> release
2016-01-16 11:15:53 -05:00
Thibaut
3eb5ccb7ea
Raise error and stop scraping on 4xx/5xx status code
2015-12-13 15:39:00 -05:00
Thibaut
6939865137
Finish Dojo scraper
2015-11-22 11:59:43 -05:00
ShaneQful
3465933543
Added dojo to devdocs & ability to define headers in scraper requests
2015-11-22 10:28:53 -05:00
Thibaut
7de19cf800
Make EntryIndex a unique index (don't add the same entry twice)
2015-04-26 18:25:11 -04:00
Thibaut
018628ea7d
Add two-pass redirection rewriter
... to avoid having to maintain huge lists of redirects. This works by doing a first pass to detect which internal URL is redirected where, before doing a second (normal) pass that rewrites all these URLs (links) with their final destination. There's a bit of monkey-patching I'm not proud of, but this works(tm).
2015-04-05 17:46:07 -04:00
Thibaut
b29d6ca002
Move doc links to manifest
2015-03-22 16:00:42 -04:00
Thibaut
cf7f446738
Change home_url to a list of links
2015-03-14 16:51:55 -04:00
Thu Trang Pham
642c1cff7d
Make sure that home_url can be nil
2015-03-03 11:34:29 -05:00
Thibaut
a59ef1cdb6
Add db_size attribute in doc manifest
2015-01-02 15:29:13 -05:00
Thibaut
456c4cb811
Add Store#size
2015-01-02 15:22:14 -05:00
Thibaut
bc5488faa2
Make docs mtime the greatest of the index and db files' mtime
2014-12-31 14:12:39 -05:00
Thibaut
ca7ff6086e
Exclude docs without a db file from the manifest
2014-12-31 14:11:30 -05:00
Thibaut
ca61a2b746
Add Doc#db_path
2014-12-31 14:00:20 -05:00
Thibaut
5c46eabc67
Output a JSON file containing all the pages' content
2014-12-31 13:54:29 -05:00