[FIX]: Restore XMLTV generation for ATSC EIT/VCT streams and correct EIT bounds checks #1773
+384
−167
Description
Fixes #1759 - This PR restores functional XMLTV generation for ATSC broadcast streams and adds comprehensive EPG parsing capabilities. ATSC streams with EIT/VCT/ETT tables now generate complete XMLTV output with program titles, descriptions, and extended text metadata.
Problem
The `-xmltv` parameter was completely non-functional for ATSC broadcast streams. For ATSC transport streams containing valid EPG data (EIT tables), channel information (VCT/TVCT tables), and extended text (ETT tables), this made it impossible to extract Electronic Program Guide data, despite the `-xmltv` parameter being specified.

Root causes identified:
- … `TS_PMT_MAP_SIZE`) were never output to XMLTV
- … `CHECK_OFFSET` macro) caused parser failures and potential buffer overruns

Solution
Core Fixes
Fixed EPG output logic (`EPG_output()` function)
- … `nb_program` value

Fixed critical buffer boundary check (`CHECK_OFFSET` macro)
- Changed `<` to `>` in boundary validation
- `if (offset + val < offset_end)` (incorrect - allowed overruns)
- `if (offset + (val) > offset_end)` (correct - prevents overruns)

Extended ATSC table support (`EPG_parse_table()` function)

New Features
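To illustrate the `CHECK_OFFSET` correction listed under Core Fixes, here is a minimal standalone sketch of the two conditions quoted there. It is not the CCExtractor source: the helper names, the use of `size_t` indices instead of pointers, the `-1` return value, and the length-prefixed reader are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Old condition quoted in the PR; a true result means "stop parsing".
 * It fires on reads that fit and stays silent on reads that overrun. */
static int old_check_rejects(size_t offset, size_t val, size_t offset_end)
{
    return offset + val < offset_end;
}

/* New condition quoted in the PR; a true result means "stop parsing".
 * It fires only when the read would pass offset_end. */
static int new_check_rejects(size_t offset, size_t val, size_t offset_end)
{
    return offset + val > offset_end;
}

/* Illustrative macro in the corrected form. Returning -1 is an
 * assumption of this sketch, not necessarily what the real macro does. */
#define CHECK_OFFSET(val)                \
    do {                                 \
        if (offset + (val) > offset_end) \
            return -1;                   \
    } while (0)

/* Hypothetical reader: one length byte followed by that many payload
 * bytes. Returns the payload length, or -1 if the field is truncated. */
static int read_len_prefixed(const uint8_t *buf, size_t offset, size_t offset_end)
{
    CHECK_OFFSET(1);           /* the length byte itself must be in bounds */
    size_t len = buf[offset];
    offset += 1;
    CHECK_OFFSET(len);         /* the payload must be in bounds too */
    return (int)len;
}
```

With a 3-byte buffer `{2, 'h', 'i'}`, `read_len_prefixed(buf, 0, 3)` accepts the field, while a declared length of 9 in the same 3 bytes is rejected before any payload byte is touched. Under the old condition the outcome is reversed: the valid field is rejected and the oversized one passes.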
Implemented ATSC ETT (Extended Text Table) parsing
- Added `EPG_ATSC_decode_ETT()` function to parse ETT table structures
- Added `EPG_ATSC_decode_ETT_text()` to extract multiple string format extended descriptions
- … `<desc>` tags in XMLTV output with detailed program information

Enhanced ATSC multiple_string decoder (`EPG_ATSC_decode_multiple_string()`)
- First segment → `event_name` (title), second segment → `text` (subtitle/description)

Improved XMLTV output formatting
- … `<desc>` tags (correct XMLTV placement)

Testing
Tested with sample files provided by @TPeterson94070 in issue #1759:
- `channel5FullTS.ts` - 5 channels with VCT/TVCT tables
- `ch12FullTS.ts` - Additional ATSC test case
- `ch29FullTS.ts` - 5 programs with extended EIT data (Nov 26-28, 2025)

Before this PR: `./ccextractor channel5FullTS.ts --xmltv 1`
- Only `.srt` file generated

After this PR: `./ccextractor channel5FullTS.ts --xmltv 1`
- `.srt` AND `.xml` files generated successfully
- … `ts-meta-id` values matching EIT event IDs

Sample XMLTV output (after ETT parsing):
Known Limitations
ATSC date/time conversion issues: in some streams the conversion occasionally produces incorrect years (pre-existing behavior).
Channel naming: XMLTV output uses numeric channel IDs (source_id) instead of human-readable names. VCT short_name and major/minor channel numbers are not currently mapped to XMLTV display-name elements.
Orphaned events: Some EIT events may appear under channel="0" when their service_id does not match any VCT-defined program. This occurs with malformed streams or when VCT data is incomplete.
The three accuracy issues above (incorrect dates, channel naming, orphaned events) are data quality problems that already existed in the codebase and are not caused by the primary bug fix in this PR.
I believe these should be addressed in follow-up PRs for better separation of concerns. However, if maintainers prefer these issues to be fixed in this PR, I'm happy to include them.
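As background for the date/time limitation, ATSC PSIP times count seconds from the GPS epoch (1980-01-06 00:00:00 UTC), and the System Time Table carries a GPS-to-UTC leap-second offset. The sketch below shows that arithmetic only. The helper names are hypothetical, it is not taken from CCExtractor, and it is not a diagnosis of where the incorrect years come from.

```c
#include <stdint.h>
#include <time.h>

/* Seconds between the Unix epoch (1970-01-01) and the GPS epoch
 * (1980-01-06), both at 00:00:00 UTC. */
#define GPS_EPOCH_UNIX_OFFSET 315964800LL

/* Hypothetical helper: converts an ATSC 32-bit time (GPS seconds) to
 * Unix time. gps_utc_offset is the leap-second count from the STT. */
static int64_t atsc_to_unix(uint32_t atsc_seconds, uint8_t gps_utc_offset)
{
    return (int64_t)atsc_seconds + GPS_EPOCH_UNIX_OFFSET - gps_utc_offset;
}

/* Hypothetical helper: calendar year (UTC) of an ATSC time,
 * or -1 if the C library cannot represent it. */
static int atsc_year(uint32_t atsc_seconds, uint8_t gps_utc_offset)
{
    time_t t = (time_t)atsc_to_unix(atsc_seconds, gps_utc_offset);
    struct tm *utc = gmtime(&t); /* not thread-safe; fine for a sketch */
    return utc ? utc->tm_year + 1900 : -1;
}
```

Doing the addition in 64 bits before narrowing to `time_t` avoids wrapping a 32-bit intermediate, which is one generic way a year can come out wrong in this kind of conversion.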