Improve output template (see desc)

* Objects can be traversed like `%(field.key1.key2)s`
* A number can be added to the field as `%(field+n)s`
* Deprecates `--autonumber-start`
pukkandan 2021-04-15 18:01:16 +05:30
parent 26e2805c3f
commit a439a3a45c
5 changed files with 78 additions and 36 deletions


@@ -395,8 +395,6 @@ Then simply run `make`. You can also run `make yt-dlp` instead to compile only t
     --output-na-placeholder TEXT     Placeholder value for unavailable meta
                                      fields in output filename template
                                      (default: "NA")
-    --autonumber-start NUMBER        Specify the start value for %(autonumber)s
-                                     (default is 1)
     --restrict-filenames             Restrict filenames to only ASCII
                                      characters, and avoid "&" and spaces in
                                      filenames
@@ -833,7 +831,19 @@ The `-o` option is used to indicate a template for the output file names while `
 **tl;dr:** [navigate me to examples](#output-template-examples).
-The basic usage of `-o` is not to set any template arguments when downloading a single file, like in `yt-dlp -o funny_video.flv "https://some/video"` (hard-coding file extension like this is not recommended). However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Date/time fields can also be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it inside the parentheses separated from the field name using a `>`. For example, `%(duration>%H-%M-%S)s`.
+The simplest usage of `-o` is not to set any template arguments when downloading a single file, like in `yt-dlp -o funny_video.flv "https://some/video"` (hard-coding file extension like this is _not_ recommended and could break certain postprocessing).
+It may however also contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations.
+The field names themselves (the part inside the parentheses) can also have some special formatting:
+1. **Date/time Formatting**: Date/time fields can be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it separated from the field name using a `>`. Eg: `%(duration>%H-%M-%S)s` or `%(upload_date>%Y-%m-%d)s`
+2. **Offset numbers**: Numeric fields can have an initial offset specified by using a `+` separator. Eg: `%(playlist_index+10)03d`. This can also be used in conjunction with the datetime formatting. Eg: `%(epoch+-3600>%H-%M-%S)s`
+3. **Object traversal**: The dictionaries and lists available in metadata can be traversed by using a `.` (dot) separator. Eg: `%(tags.0)s` or `%(subtitles.en.-1.ext)s`. Note that the fields that become available using this method are not listed below. Use `-j` to see such fields.
+To summarize, the general syntax for a field is:
+```
+%(name[.keys][+offset][>strf])[flags][width][.precision][length]type
+```
 Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template separated by a colon `:`. The different filetypes supported are `subtitle`, `thumbnail`, `description`, `annotation`, `infojson`, `pl_description`, `pl_infojson`, `chapter`. For example, `-o '%(title)s.%(ext)s' -o 'thumbnail:%(title)s\%(title)s.%(ext)s'` will put the thumbnails in a folder with the same name as the video.
@@ -1263,6 +1273,7 @@ These are all the deprecated options and the current alternative to achieve the
     --all-formats                    -f all
     --all-subs                       --sub-langs all --write-subs
     --autonumber-size NUMBER         Use string formatting. Eg: %(autonumber)03d
+    --autonumber-start NUMBER        Use internal field formatting like %(autonumber+NUMBER)s
     --metadata-from-title FORMAT     --parse-metadata "%(title)s:FORMAT"
     --prefer-avconv                  avconv is no longer officially supported (Alias: --no-prefer-ffmpeg)
     --prefer-ffmpeg                  Default (Alias: --no-prefer-avconv)
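
The field syntax summarized in the README change above, `%(name[.keys][+offset][>strf])[flags][width][.precision][length]type`, can be illustrated with a small standalone parser. This is only a sketch: `FIELD_RE` and `parse_field` are hypothetical names, and the pattern is a simplification modelled on the README summary, not the expression yt-dlp itself uses.

```python
import re

# Hypothetical, simplified pattern modelled on the README syntax summary.
# It is NOT the pattern used inside yt-dlp.
FIELD_RE = re.compile(r'''(?x)
    %\(
        (?P<name>\w+(?:\.[-\w]+)*)           # field name with optional .keys
        (?:\+(?P<offset>-?\d+(?:\.\d+)?))?   # optional +offset (may be negative)
        (?:>(?P<strf>.+?))?                  # optional >strftime format
    \)
    (?P<format>[-+ #0]*\d*(?:\.\d+)?[hlL]?[a-zA-Z])  # flags, width, precision, length, type
''')


def parse_field(template):
    """Return the parts of the first field found in template, or None."""
    match = FIELD_RE.search(template)
    return match.groupdict() if match else None


print(parse_field('%(playlist_index+10)03d'))
print(parse_field('%(epoch+-3600>%H-%M-%S)s'))
print(parse_field('%(subtitles.en.-1.ext)s'))
```

For `%(playlist_index+10)03d` the sketch separates the name `playlist_index`, the offset `10` and the printf-style part `03d`. Fields without an offset or date format report `None` for those parts.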


@@ -655,6 +655,8 @@ class TestYoutubeDL(unittest.TestCase):
             'height': 1080,
             'title1': '$PATH',
             'title2': '%PATH%',
+            'timestamp': 1618488000,
+            'formats': [{'id': 'id1'}, {'id': 'id2'}]
         }
         def fname(templ, na_placeholder='NA'):
@@ -671,6 +673,7 @@ class TestYoutubeDL(unittest.TestCase):
         # Or by provided placeholder
         self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder='none'), 'none-none-1234.mp4')
         self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder=''), '--1234.mp4')
+        self.assertEqual(fname('%(height)s.%(ext)s'), '1080.mp4')
         self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
         self.assertEqual(fname('%(height)6d.%(ext)s'), '  1080.mp4')
         self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080  .mp4')
@@ -688,6 +691,12 @@ class TestYoutubeDL(unittest.TestCase):
         self.assertEqual(fname('%%(width)06d.%(ext)s'), '%(width)06d.mp4')
         self.assertEqual(fname('Hello %(title1)s'), 'Hello $PATH')
         self.assertEqual(fname('Hello %(title2)s'), 'Hello %PATH%')
+        self.assertEqual(fname('%(timestamp+-1000>%H-%M-%S)s'), '11-43-20')
+        self.assertEqual(fname('%(id+1)05d'), '01235')
+        self.assertEqual(fname('%(width+100)05d'), 'NA')
+        self.assertEqual(fname('%(formats.0)s').replace("u", ""), "{'id' - 'id1'}")
+        self.assertEqual(fname('%(formats.-1.id)s'), 'id2')
+        self.assertEqual(fname('%(formats.2)s'), 'NA')
     def test_format_note(self):
         ydl = YoutubeDL()


@@ -99,6 +99,7 @@ from .utils import (
     strftime_or_none,
     subtitles_filename,
     to_high_limit_path,
+    traverse_dict,
     UnavailableVideoError,
     url_basename,
     version_tuple,
@@ -796,6 +797,7 @@ class YoutubeDL(object):
     def prepare_outtmpl(self, outtmpl, info_dict, sanitize=None):
         """ Make the template and info_dict suitable for substitution (outtmpl % info_dict)"""
         template_dict = dict(info_dict)
+        na = self.params.get('outtmpl_na_placeholder', 'NA')
         # duration_string
         template_dict['duration_string'] = (  # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
@@ -821,18 +823,10 @@ class YoutubeDL(object):
         elif template_dict.get('width'):
             template_dict['resolution'] = '%dx?' % template_dict['width']
-        if sanitize is None:
-            sanitize = lambda k, v: v
-        template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
-                             for k, v in template_dict.items()
-                             if v is not None and not isinstance(v, (list, tuple, dict)))
-        na = self.params.get('outtmpl_na_placeholder', 'NA')
-        template_dict = collections.defaultdict(lambda: na, template_dict)
         # For fields playlist_index and autonumber convert all occurrences
         # of %(field)s to %(field)0Nd for backward compatibility
         field_size_compat_map = {
-            'playlist_index': len(str(template_dict['n_entries'])),
+            'playlist_index': len(str(template_dict.get('n_entries', na))),
             'autonumber': autonumber_size,
         }
         FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
@@ -844,32 +838,51 @@ class YoutubeDL(object):
             outtmpl)
         numeric_fields = list(self._NUMERIC_FIELDS)
-        # Format date
-        FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
-        for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
-            conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
-            if key in template_dict:
-                continue
-            value = strftime_or_none(template_dict.get(field), frmt, na)
-            if conv_type in 'crs':  # string
-                value = sanitize(field, value)
-            else:  # number
-                numeric_fields.append(key)
-                value = float_or_none(value, default=None)
-            if value is not None:
-                template_dict[key] = value
+        if sanitize is None:
+            sanitize = lambda k, v: v
+        # Internal Formatting = name.key1.key2+number>strf
+        INTERNAL_FORMAT_RE = FORMAT_RE.format(
+            r'''(?P<final_key>
+                (?P<fields>\w+(?:\.[-\w]+)*)
+                (?:\+(?P<add>-?\d+(?:\.\d+)?))?
+                (?:>(?P<strf_format>.+?))?
+            )''')
+        for mobj in re.finditer(INTERNAL_FORMAT_RE, outtmpl):
+            mobj = mobj.groupdict()
+            # Object traversal
+            fields = mobj['fields'].split('.')
+            final_key = mobj['final_key']
+            value = traverse_dict(template_dict, fields)
+            # Offset the value
+            if mobj['add']:
+                value = float_or_none(value)
+                if value is not None:
+                    value = value + float(mobj['add'])
+            # Datetime formatting
+            if mobj['strf_format']:
+                value = strftime_or_none(value, mobj['strf_format'])
+            if mobj['type'] in 'crs' and value is not None:  # string
+                value = sanitize('%{}'.format(mobj['type']) % fields[-1], value)
+            else:  # numeric
+                numeric_fields.append(final_key)
+                value = float_or_none(value)
+            if value is not None:
+                template_dict[final_key] = value
         # Missing numeric fields used together with integer presentation types
         # in format specification will break the argument substitution since
         # string NA placeholder is returned for missing fields. We will patch
         # output template for missing fields to meet string presentation type.
         for numeric_field in numeric_fields:
-            if numeric_field not in template_dict:
+            if template_dict.get(numeric_field) is None:
                 outtmpl = re.sub(
                     FORMAT_RE.format(re.escape(numeric_field)),
                     r'%({0})s'.format(numeric_field), outtmpl)
+        template_dict = collections.defaultdict(lambda: na, (
+            (k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
+            for k, v in template_dict.items() if v is not None))
         return outtmpl, template_dict
     def _prepare_filename(self, info_dict, tmpl_type='default'):
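
The order of operations in the hunk above (apply the offset, then the date format) can be sketched outside of yt-dlp as follows. `offset_and_format` is a hypothetical helper written for this illustration. It assumes the value is a Unix timestamp interpreted as UTC when a date format is given, and it does not reproduce `float_or_none` or `strftime_or_none` exactly.

```python
import datetime


def offset_and_format(value, add=None, strf_format=None):
    """Apply an optional numeric offset, then optional strftime formatting.

    Returns None when an offset or date format is requested but the value
    is missing or not numeric; a caller would substitute its placeholder.
    """
    if add is not None:
        try:
            value = float(value) + float(add)
        except (TypeError, ValueError):
            return None
    if strf_format is not None:
        if not isinstance(value, (int, float)):
            return None
        # Assumption: the value is a Unix timestamp interpreted as UTC.
        moment = datetime.datetime.fromtimestamp(value, datetime.timezone.utc)
        value = moment.strftime(strf_format)
    return value


# Mirrors the values used in the new test cases.
print(offset_and_format(1618488000, add='-1000', strf_format='%H-%M-%S'))  # 11-43-20
print('%05d' % offset_and_format('1234', add='1'))                         # 01235
print(offset_and_format(None, add='100'))                                  # None
```

The `None` result in the last call corresponds to the case where the test expects the `NA` placeholder for a missing `width`.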


@@ -908,7 +908,7 @@ def parseOpts(overrideArguments=None):
     filesystem.add_option(
         '--autonumber-start',
         dest='autonumber_start', metavar='NUMBER', default=1, type=int,
-        help='Specify the start value for %(autonumber)s (default is %default)')
+        help=optparse.SUPPRESS_HELP)
     filesystem.add_option(
         '--restrict-filenames',
         action='store_true', dest='restrictfilenames', default=False,


@@ -6092,11 +6092,20 @@ def load_plugins(name, type, namespace):
 def traverse_dict(dictn, keys, casesense=True):
-    if not isinstance(dictn, dict):
-        return None
-    first_key = keys[0]
-    if not casesense:
-        dictn = {key.lower(): val for key, val in dictn.items()}
-        first_key = first_key.lower()
-    value = dictn.get(first_key, None)
-    return value if len(keys) < 2 else traverse_dict(value, keys[1:], casesense)
+    keys = list(keys)[::-1]
+    while keys:
+        key = keys.pop()
+        if isinstance(dictn, dict):
+            if not casesense:
+                dictn = {k.lower(): v for k, v in dictn.items()}
+                key = key.lower()
+            dictn = dictn.get(key)
+        elif isinstance(dictn, (list, tuple, compat_str)):
+            key, n = int_or_none(key), len(dictn)
+            if key is not None and -n <= key < n:
+                dictn = dictn[key]
+            else:
+                dictn = None
+        else:
+            return None
+    return dictn
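
The traversal idea in the rewritten `traverse_dict` can be shown with a self-contained sketch that runs without yt-dlp's helpers. `traverse` is a hypothetical stand-in: it replaces `int_or_none` and `compat_str` with plain `int()` and `str`, and it leaves out the `casesense` option.

```python
def traverse(obj, keys):
    """Walk dicts by key and sequences by integer index.

    Returns None as soon as a step cannot be resolved.
    """
    for key in keys:
        if isinstance(obj, dict):
            obj = obj.get(key)
        elif isinstance(obj, (list, tuple, str)):
            try:
                index = int(key)
            except ValueError:
                return None
            # Accept negative indices, but reject anything out of range.
            if not -len(obj) <= index < len(obj):
                return None
            obj = obj[index]
        else:
            return None
    return obj


info = {'formats': [{'id': 'id1'}, {'id': 'id2'}]}
print(traverse(info, 'formats.-1.id'.split('.')))  # id2
print(traverse(info, 'formats.2'.split('.')))      # None
```

The two calls correspond to the `%(formats.-1.id)s` and `%(formats.2)s` test cases, where the second one falls back to the `NA` placeholder.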