Adding support for drop frame #354
Thanks, this looks good. |
|
############################################################################### ################################################################################################# Source VTT file : WEBVTT 0 1 2 3 4 5 6 7 8 9 10 11 With Drop Frame Enabled : ################################################## 00:00:00;18 94ae 94ae 9420 9420 94d0 5768 e56e 20e9 f420 e3ef 6de5 7320 f4ef 20e6 e96e 64e9 6e67 20f4 68e5 9470 ef6e e52c 942f 942f 00:00:02;12 94ae 94ae 9420 9420 94d0 4920 61ec f761 7973 2073 6179 20f4 6861 f420 e9e6 20f4 68e5 7920 e361 6e80 9470 ecef 76e5 2079 ef75 2061 f420 79ef 75f2 206d e573 73e9 e573 f42c 942f 942f 00:00:06;23 94ae 94ae 9420 9420 9470 e361 ec6d 2079 ef75 2061 f420 79ef 75f2 206d efef 64e9 e573 f42c 942f 942f 00:00:08;16 94ae 94ae 9420 9420 94d0 616e 6420 ec61 7567 6820 f7e9 f468 2079 ef75 2061 f420 79ef 75f2 9470 f175 e9f2 6be9 e573 f42c 942f 942f 00:00:10;29 94ae 94ae 9420 9420 94d0 79ef 75a7 76e5 2070 f2ef 6261 62ec 7920 e6ef 756e 6420 79ef 75f2 9470 70e5 f273 ef6e ae80 942f 942f 00:00:13;10 94ae 94ae 9420 9420 9470 4f6e 20f4 ef64 6179 a773 20e3 6173 e52c 942f 942f 00:00:14;10 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:14;21 94ae 94ae 9420 9420 94d0 4368 6170 6d61 6e20 7361 7973 2068 e520 f468 ef75 6768 f420 68e5 20e6 ef75 6e64 9470 f468 e520 31b3 2079 e561 f273 2061 67ef 20f7 68e5 6e20 68e5 206d e5f4 20cd f2ae 942f 942f 00:00:19;03 94ae 94ae 9420 9420 9470 c7ef f264 ef6e ae80 942f 942f 00:00:19;16 94ae 94ae 9420 9420 94d0 c275 f420 6eef f720 62e5 ece9 e576 e573 20f4 6861 f420 6d75 73f4 2068 6176 e580 9470 61ec ec20 62e5 e56e 2061 2064 f2e5 616d ae80 942f 942f 00:00:22;21 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:23;02 94ae 94ae 9420 9420 94d0 4368 6170 6d61 6e20 7361 7973 206e eff7 20f4 6861 f420 68e5 a773 9470 e6e9 6e61 ecec 7920 61f7 616b e52c 942f 942f 00:00:24;19 94ae 94ae 9420 9420 1370 68e5 20f2 e561 ece9 7ae5 7320 68e5 a773 2062 e5e5 6e20 73f4 f275 6e67 94d0 61ec ef6e 6720 e6ef f220 79e5 61f2 7320 616e 6420 e973 
2064 e56d 616e 64e9 6e67 9470 cdf2 ae80 942f 942f 00:00:29;24 94ae 94ae 9420 9420 9470 43ef f264 2070 f2ef 76e5 2068 e973 20ec ef76 e580 942f 942f 00:00:31;06 94ae 94ae 9420 9420 94d0 6279 2070 f2ef 70ef 73e9 6e67 206d 61f2 f2e9 6167 e520 f4ef 2068 e96d 9470 f4ef 6461 7980 942f 942f 00:00:33;16 94ae 94ae 9420 9420 94d0 eff2 2070 f2e5 7061 f2e5 20e6 eff2 20f4 68e5 20f2 e5ec 61f4 e9ef 6e73 68e9 7080 9470 f4ef 2062 e520 ef76 e5f2 ae80 942f 942f 00:00:36;19 94ae 94ae 9420 9420 9470 4ce5 f4a7 7320 68e5 61f2 20f4 68e5 e9f2 20e3 6173 e5ae 942f 942f 00:00:38;11 942c 942c 00:00:40;24 94ae 94ae 9420 9420 9470 5468 e520 43ef 75f2 f420 e973 206e eff7 20e9 6e20 73e5 7373 e9ef 6eae 942f 942f 00:00:42;12 94ae 94ae 9420 9420 94d0 5468 e520 c8ef 6eef f261 62ec e520 4a75 6467 e520 d3f4 61f2 9470 70f2 e573 e964 e96e 67ae 942f 942f 00:00:45;03 942c 942c 00:00:49;21 94ae 94ae 9420 9420 9470 d9ef 75f2 20c8 ef6e eff2 2c80 942f 942f 00:00:50;06 94ae 94ae 9420 9420 94d0 f468 e973 20e9 7320 f468 e520 e361 73e5 20ef e620 4368 6170 6d61 6e80 9470 76e5 f273 7573 20c7 eff2 64ef 6eae 942f 942f 00:00:51;29 94ae 94ae 9420 9420 9470 5468 616e 6b20 79ef 7520 76e5 f279 206d 75e3 682c 942f 942f 00:00:53;08 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:53;15 94ae 94ae 9420 9420 9470 4368 6170 6d61 6eae 942f 942f 00:00:53;29 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:54;08 94ae 94ae 9420 9420 9470 c7ef f264 ef6e ae80 942f 942f With Non Drop Frame : ################################################## Scenarist_SCC V1.0 00:00:00:17 94ae 94ae 9420 9420 94d0 5768 e56e 20e9 f420 e3ef 6de5 7320 f4ef 20e6 e96e 64e9 6e67 20f4 68e5 9470 ef6e e52c 942f 942f 00:00:02:11 94ae 94ae 9420 9420 94d0 4920 61ec f761 7973 2073 6179 20f4 6861 f420 e9e6 20f4 68e5 7920 e361 6e80 9470 ecef 76e5 2079 ef75 2061 f420 79ef 75f2 206d e573 73e9 e573 f42c 942f 942f 00:00:06:22 94ae 94ae 9420 9420 9470 e361 ec6d 2079 ef75 2061 f420 79ef 75f2 206d efef 64e9 e573 f42c 942f 942f 00:00:08:15 94ae 94ae 9420 9420 94d0 616e 6420 
ec61 7567 6820 f7e9 f468 2079 ef75 2061 f420 79ef 75f2 9470 f175 e9f2 6be9 e573 f42c 942f 942f 00:00:10:28 94ae 94ae 9420 9420 94d0 79ef 75a7 76e5 2070 f2ef 6261 62ec 7920 e6ef 756e 6420 79ef 75f2 9470 70e5 f273 ef6e ae80 942f 942f 00:00:13:09 94ae 94ae 9420 9420 9470 4f6e 20f4 ef64 6179 a773 20e3 6173 e52c 942f 942f 00:00:14:09 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:14:20 94ae 94ae 9420 9420 94d0 4368 6170 6d61 6e20 7361 7973 2068 e520 f468 ef75 6768 f420 68e5 20e6 ef75 6e64 9470 f468 e520 31b3 2079 e561 f273 2061 67ef 20f7 68e5 6e20 68e5 206d e5f4 20cd f2ae 942f 942f 00:00:19:02 94ae 94ae 9420 9420 9470 c7ef f264 ef6e ae80 942f 942f 00:00:19:15 94ae 94ae 9420 9420 94d0 c275 f420 6eef f720 62e5 ece9 e576 e573 20f4 6861 f420 6d75 73f4 2068 6176 e580 9470 61ec ec20 62e5 e56e 2061 2064 f2e5 616d ae80 942f 942f 00:00:22:21 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:23:02 94ae 94ae 9420 9420 94d0 4368 6170 6d61 6e20 7361 7973 206e eff7 20f4 6861 f420 68e5 a773 9470 e6e9 6e61 ecec 7920 61f7 616b e52c 942f 942f 00:00:24:19 94ae 94ae 9420 9420 1370 68e5 20f2 e561 ece9 7ae5 7320 68e5 a773 2062 e5e5 6e20 73f4 f275 6e67 94d0 61ec ef6e 6720 e6ef f220 79e5 61f2 7320 616e 6420 e973 2064 e56d 616e 64e9 6e67 9470 cdf2 ae80 942f 942f 00:00:29:24 94ae 94ae 9420 9420 9470 43ef f264 2070 f2ef 76e5 2068 e973 20ec ef76 e580 942f 942f 00:00:31:06 94ae 94ae 9420 9420 94d0 6279 2070 f2ef 70ef 73e9 6e67 206d 61f2 f2e9 6167 e520 f4ef 2068 e96d 9470 f4ef 6461 7980 942f 942f 00:00:33:15 94ae 94ae 9420 9420 94d0 eff2 2070 f2e5 7061 f2e5 20e6 eff2 20f4 68e5 20f2 e5ec 61f4 e9ef 6e73 68e9 7080 9470 f4ef 2062 e520 ef76 e5f2 ae80 942f 942f 00:00:36:19 94ae 94ae 9420 9420 9470 4ce5 f4a7 7320 68e5 61f2 20f4 68e5 e9f2 20e3 6173 e5ae 942f 942f 00:00:38:10 942c 942c 00:00:40:23 94ae 94ae 9420 9420 9470 5468 e520 43ef 75f2 f420 e973 206e eff7 20e9 6e20 73e5 7373 e9ef 6eae 942f 942f 00:00:42:12 94ae 94ae 9420 9420 94d0 5468 e520 c8ef 6eef f261 62ec e520 4a75 6467 e520 d3f4 61f2 9470 
70f2 e573 e964 e96e 67ae 942f 942f 00:00:45:02 942c 942c 00:00:49:20 94ae 94ae 9420 9420 9470 d9ef 75f2 20c8 ef6e eff2 2c80 942f 942f 00:00:50:05 94ae 94ae 9420 9420 94d0 f468 e973 20e9 7320 f468 e520 e361 73e5 20ef e620 4368 6170 6d61 6e80 9470 76e5 f273 7573 20c7 eff2 64ef 6eae 942f 942f 00:00:51:28 94ae 94ae 9420 9420 9470 5468 616e 6b20 79ef 7520 76e5 f279 206d 75e3 682c 942f 942f 00:00:53:07 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f 00:00:53:15 94ae 94ae 9420 9420 9470 4368 6170 6d61 6eae 942f 942f 00:00:53:29 94ae 94ae 9420 9420 9470 cdf2 ae80 942f 942f |
| just carried over when implementing positioning. | ||
| """ | ||
|
|
||
| import os |
The os module is imported but never used anywhere in the file. Please remove it.
| # Only support one language. | ||
| lang = list(caption_set.get_languages())[0] | ||
| captions = caption_set.get_captions(lang) | ||
|
|
Use is instead of == for boolean comparisons.
This should be: if self.drop_frame is True / elif self.drop_frame is False
(or simply: if self.drop_frame / else).
| for index, (code, start, end) in enumerate(codes): | ||
| code_words = len(code) / 5 + 8 | ||
| code_time_microseconds = code_words * MICROSECONDS_PER_CODEWORD | ||
| # code_words = len(code) / 5 + 8 |
|
|
||
| if not self.drop_frame: | ||
| # bump by one codeword if same HH:MM:SS:FF as last | ||
| def _ndf_frames(us: int) -> int: |
Please move this function outside the loop so it is not redefined on every iteration.
Also, it uses int() while _format_timestamp_ndf uses math.floor().
Both should use math.floor() so that the two frame calculations mirror each other.
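A minimal sketch of what the hoisted helper could look like. The name ndf_frames, the constants, and the 1000/1001 scaling are assumptions for illustration, since the body of _ndf_frames is not shown in this thread. The point is only that math.floor() and int() disagree for negative values, so both code paths should use the same one.

```python
import math

MICROSECONDS_PER_SECOND = 1_000_000
# Assumption: non-drop-frame timecode counts 30 frames per second after a
# 1000/1001 slowdown. The diff excerpt does not show the real constants.
NDF_FRAMES_PER_SECOND = 30
NDF_TIME_SCALE = 1000.0 / 1001.0


def ndf_frames(microseconds: int) -> int:
    """Return the total non-drop-frame frame count for a timestamp.

    Hypothetical module-level replacement for the nested _ndf_frames.
    """
    seconds = microseconds / MICROSECONDS_PER_SECOND * NDF_TIME_SCALE
    # math.floor() rounds toward negative infinity, while int() truncates
    # toward zero, so the two differ for negative inputs.
    return math.floor(seconds * NDF_FRAMES_PER_SECOND)
```

Defined once at module level, the helper can be shared by the loop and by the timestamp formatter, which keeps the rounding rule in one place.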
| for row, line in enumerate(lines): | ||
| row += 16 - len(lines) | ||
| # Move cursor to column 0 of the destination row | ||
| for _ in range(2): |
The SCC spec requires control codes (including PACs) to be sent in pairs for error resilience, so we should revert this to range(2) as it was.
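For illustration, a small sketch of the doubling this comment asks for. The function name emit_paired and its repeat parameter are hypothetical and not part of the writer; the code word 9470 is taken from the sample output earlier in this thread.

```python
def emit_paired(control_code: str, repeat: int = 2) -> str:
    """Return a control code repeated `repeat` times, space separated.

    Hypothetical helper: it mirrors the `for _ in range(2)` loop in the diff.
    """
    return "".join(f"{control_code} " for _ in range(repeat))


# A PAC emitted twice, as the reviewer requests.
line_fragment = emit_paired("9470")
```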
| # first chunk at 'start' | ||
| _keep = 80 - len(_prefix) - len(_suffix) # 74 | ||
| _first = _prefix + _code_tokens[:_keep] + _suffix | ||
| if self.drop_frame: |
The sequence
    if self.drop_frame:
        self._format_timestamp_df(x)
    else:
        self._format_timestamp_ndf(x)
appears 6 times.
Maybe replace it with a helper function like:
    def _format_timestamp(self, us):
        return self._format_timestamp_df(us) if self.drop_frame else self._format_timestamp_ndf(us)
That would turn each 4-line block into just:
    time_to_format = self._format_timestamp(time_to_format)
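A runnable sketch of that dispatch, with stand-in formatters so it can be tried in isolation. The class TimecodeFormatter, the helper names _frame_count and _split, and the frame arithmetic are assumptions: they use the textbook 29.97 fps drop-frame labelling, not necessarily what this PR implements.

```python
# Assumption: 29.97 fps video, i.e. 30000/1001 frames per second.
FRAMES_PER_10_MIN = 17982  # 10 * 60 * 30 minus 9 * 2 dropped labels
FRAMES_PER_MIN = 1798      # 60 * 30 minus 2 dropped labels


class TimecodeFormatter:
    """Hypothetical stand-in for the writer, not the PR's implementation."""

    def __init__(self, drop_frame: bool = False):
        self.drop_frame = drop_frame

    @staticmethod
    def _frame_count(us: int) -> int:
        # Integer arithmetic avoids float rounding at frame boundaries.
        return us * 30000 // (1001 * 1_000_000)

    @staticmethod
    def _split(frames: int):
        # (hours, minutes, seconds, frames) at 30 labels per second.
        return (frames // 108000, (frames // 1800) % 60,
                (frames // 30) % 60, frames % 30)

    def _format_timestamp_ndf(self, us: int) -> str:
        return "%02d:%02d:%02d:%02d" % self._split(self._frame_count(us))

    def _format_timestamp_df(self, us: int) -> str:
        frames = self._frame_count(us)
        tens, rem = divmod(frames, FRAMES_PER_10_MIN)
        # Skip labels 00 and 01 at each minute, except every tenth minute.
        frames += 18 * tens
        if rem > 2:
            frames += 2 * ((rem - 2) // FRAMES_PER_MIN)
        return "%02d:%02d:%02d;%02d" % self._split(frames)

    def _format_timestamp(self, us: int) -> str:
        # Single dispatch point, replacing the repeated if/else blocks.
        if self.drop_frame:
            return self._format_timestamp_df(us)
        return self._format_timestamp_ndf(us)
```

With one dispatch method, call sites no longer need to know which mode is active, and a later change to the selection logic touches one place.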
| _suffix = ["942f", "942f"] | ||
| _code_tokens = code.split() | ||
| _total = len(_prefix) + len(_code_tokens) + len(_suffix) | ||
| if _total > 80: |
Define 80 as a named constant, with a comment giving the rationale for the limit.
There is also an edge case that's unlikely in practice.
If a caption has, say, 150 code tokens:
- First chunk: 4 + 74 + 2 = 80, which is fine
- Second chunk: 4 + 76 + 2 = 82, which exceeds 80
So any caption with more than 148 code tokens (74 × 2) would produce a second chunk that violates the same 80-token limit.
I'd say it's worth a low-priority mention, something like "the split
only handles 2 chunks".
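One way to close that edge case is to split into as many chunks as needed. This is a sketch under assumptions: MAX_TOKENS_PER_LINE, PREFIX, SUFFIX, and split_code_tokens are hypothetical names, and the sketch does not assign timestamps to the extra chunks, which the real writer would still have to do.

```python
from typing import List

# Assumption: 80 tokens per SCC line is the limit this PR enforces; the
# rationale for the number should be documented next to the constant.
MAX_TOKENS_PER_LINE = 80
PREFIX = ["94ae", "94ae", "9420", "9420"]
SUFFIX = ["942f", "942f"]


def split_code_tokens(code_tokens: List[str],
                      limit: int = MAX_TOKENS_PER_LINE) -> List[List[str]]:
    """Wrap tokens in PREFIX/SUFFIX, splitting so no chunk exceeds `limit`."""
    keep = limit - len(PREFIX) - len(SUFFIX)  # 74 payload tokens per chunk
    if keep <= 0:
        raise ValueError("limit leaves no room for payload tokens")
    if len(code_tokens) <= keep:
        # Common case: everything fits on one line.
        return [PREFIX + code_tokens + SUFFIX]
    # General case: any number of chunks, each at most `limit` tokens long.
    return [PREFIX + code_tokens[i:i + keep] + SUFFIX
            for i in range(0, len(code_tokens), keep)]
```

For the 150-token example above this yields three chunks of 80, 80, and 8 tokens instead of an 80-token chunk followed by an 82-token one.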
| last_emitted_frames = _ndf_frames(start) | ||
|
|
||
| # ---- MINIMAL SPLIT (only if >80 tokens) ---- | ||
| _prefix = ["94ae", "94ae", "9420", "9420"] |
_prefix and _suffix are recreated on every iteration.
They are constants, so move them outside the loop.
|
Please resolve the inline comments and merge main into your branch so the PR is mergeable. We'll also need additional tests covering:
But those can come as a follow-up — let's get the current issues fixed first. |
|
|
||
| class SCCWriter(BaseWriter): | ||
| def __init__(self, *args, **kw): | ||
| def __init__(self, *args, drop_frame=True, **kw): |
Before this PR, _format_timestamp produced non-drop-frame timecodes.
With drop_frame=True as the default, anyone calling
SCCWriter() without arguments now gets different output, which is a silent breaking change.
Please change the default to drop_frame=False to preserve the existing behaviour.
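A sketch of the backward-compatible signature. BaseWriter here is an empty stand-in so the snippet runs on its own; the real base class and the rest of SCCWriter are not shown.

```python
class BaseWriter:
    """Empty stand-in for the real base class (assumption)."""

    def __init__(self, *args, **kw):
        pass


class SCCWriter(BaseWriter):
    def __init__(self, *args, drop_frame=False, **kw):
        # Keyword-only flag that defaults to the pre-PR behaviour
        # (non-drop-frame), so existing SCCWriter() callers see no change.
        super().__init__(*args, **kw)
        self.drop_frame = drop_frame


legacy_writer = SCCWriter()                 # unchanged output
df_writer = SCCWriter(drop_frame=True)      # explicit opt-in
```

Callers who want drop-frame output opt in explicitly, and the default can be revisited later in a release that documents the change.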
Added DROP frame support