TXTD: Difference between revisions
Revision as of 16:45, 10 January 2025
In Tomba! 2: The Evil Swine Return, a TXTD file is a data structure embedded within the game's resource files (such as the DAT file). It contains the text data used for in-game dialogue, descriptions, and other textual elements. The text is stored in an encoded form, using a custom byte-to-character mapping, and is accompanied by metadata that determines how the text is organized, displayed, or interacted with in the game.
Structure of a TXTD File
The TXTD file is divided into two hierarchical layers:
- Master Table:
  - The top-level structure, pointing to multiple entries of text data.
  - Each pointer in this table corresponds to a "block" or "group" of related text strings.
  - Contains metadata about the number and location of these text blocks.
- Entry Table:
  - Each entry contains pointers to individual text strings within the text block.
  - Provides metadata such as the offset of the text, the size, and possibly control codes for text behavior (e.g., color, pauses).
Detailed Structure
1. Header Section
- The file begins with a header containing the following fields:
  - Master Root Pointer: Offset to the start of the master table.
  - Master Entry Count: Number of entries in the master table.
  - Additional padding or unused bytes.
2. Master Table
- A list of pointers to entry tables.
- Each pointer is represented by a relative address (offset from the start of the file or current context).
3. Entry Table
- For each master entry, there is a corresponding entry table.
- Contains metadata for individual text strings:
  - Pointer to Text Data: Offset of the text string relative to the start of the table.
  - Extra Field: May encode additional information about the text, such as speaker or type.
4. Text Strings
- Binary-encoded text data begins at the offsets specified in the entry table.
- Uses a custom encoding scheme to represent characters (as seen in the letters dictionary of the script).
- Ends with a terminator byte (0xFF).
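The layered structure described above can be sketched as a small table walker. This is only an illustration under stated assumptions, not the game's confirmed layout: the article gives no field widths, so the sketch assumes every field is an unsigned 16-bit little-endian value, that the header is exactly a root pointer followed by an entry count, and that each entry table begins with its own entry count. The names u16 and walk_txtd, and the sample buffer, are hypothetical.

```python
import struct

INVALID_POINTER = 0xFFFF  # value the article names as an invalid pointer


def u16(buf, pos):
    """Read one unsigned 16-bit little-endian integer (assumed field width)."""
    return struct.unpack_from("<H", buf, pos)[0]


def walk_txtd(buf, base=0):
    """Return (master_index, entry_index, text_offset, extra) tuples.

    Assumed layout (hypothetical, not confirmed by the article):
      header:       master_root (u16), master_count (u16)
      master table: master_count pointers (u16), relative to `base`
      entry table:  entry_count (u16), then (text_ptr u16, extra u16) pairs,
                    with text_ptr relative to the start of the entry table
    """
    master_root = u16(buf, base)
    master_count = u16(buf, base + 2)
    found = []
    for m in range(master_count):
        table_ptr = u16(buf, base + master_root + 2 * m)
        if table_ptr == INVALID_POINTER:
            continue  # no entry table for this master entry
        table = base + table_ptr
        entry_count = u16(buf, table)
        for e in range(entry_count):
            text_ptr = u16(buf, table + 2 + 4 * e)
            extra = u16(buf, table + 4 + 4 * e)
            if text_ptr == INVALID_POINTER:
                continue  # skipped, as the article describes for 0xFFFF
            found.append((m, e, table + text_ptr, extra))
    return found


# Synthetic buffer built to match the assumed layout above (not game data).
sample = bytes([
    0x04, 0x00, 0x01, 0x00,  # header: master_root=4, master_count=1
    0x06, 0x00,              # master table: entry table at offset 6
    0x02, 0x00,              # entry table: entry_count=2
    0x0A, 0x00, 0x07, 0x00,  # entry 0: text_ptr=10, extra=7
    0xFF, 0xFF, 0x00, 0x00,  # entry 1: invalid pointer, skipped
    0x41, 0x42, 0xFF,        # string bytes followed by the 0xFF terminator
])
strings = walk_txtd(sample)
print(strings)
```

Resolving each text pointer against the start of its own entry table follows the article's wording ("relative to the start of the table"); if real files turn out to use a different base, only the final addition in the inner loop would change.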
TXTD Encoding Scheme
- Character Representation:
  - Characters are stored as byte values, with each byte mapping to a specific character or control code.
  - Example:
    - 0x41 → A
    - 0x42 → B
    - 0xFA → Line break (\n)
    - 0xFC → Pause ({$PAUSE})
- Control Codes:
  - Non-alphanumeric bytes are often used for special formatting or commands.
  - Examples:
    - {$COLOR_F1}: Changes text color.
    - {$END}: Marks the end of a text block.
- Termination:
  - Each text string ends with the byte 0xFF, signaling the end of the string.
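As a rough illustration of the encoding scheme, the sketch below decodes a byte sequence using only the four mappings listed above and stops at the 0xFF terminator. The name decode_string is hypothetical, the table is deliberately incomplete (the full mapping lives in the script's letters dictionary), and rendering unknown bytes as bracketed hex is a choice made for this example only.

```python
# Only the byte values listed in the article; the complete table is the
# script's `letters` dictionary, which is not reproduced here.
LETTERS = {
    0x41: "A",
    0x42: "B",
    0xFA: "\n",        # line break
    0xFC: "{$PAUSE}",  # pause control code
}
TERMINATOR = 0xFF


def decode_string(data, start=0):
    """Decode bytes from `start` up to (not including) the 0xFF terminator."""
    out = []
    for byte in data[start:]:
        if byte == TERMINATOR:
            break
        # Unknown bytes stay visible as hex (an assumption of this sketch).
        out.append(LETTERS.get(byte, "[0x%02X]" % byte))
    return "".join(out)


raw = bytes([0x41, 0x42, 0xFC, 0xFA, 0x42, 0x41, 0xFF, 0x41])
decoded = decode_string(raw)
print(repr(decoded))
```

Bytes after the terminator are ignored, so the trailing 0x41 in the sample does not appear in the decoded text.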
Tomba! 2 TXTD File Extraction Script
This script is used for extracting and interpreting TXTD files from the DAT file in Tomba! 2: The Evil Swine Return. TXTD files often contain text data, such as in-game dialogues, descriptions, or other textual assets. The script decodes this data using a custom character set and formats it for readability or modification.
Script Details
Key Functions
- preview(DAT, offset):
  - Main function for extracting text from the specified DAT file.
  - Takes two arguments:
    - DAT: Path to the DAT file.
    - offset: Offset where the TXTD data begins.
  - Processes the data in two hierarchical layers:
    - Master Entries: Top-level pointers directing to specific text blocks.
    - Entry Headers: Sub-pointers within each master entry that direct to individual text strings.
  - Calls prepareText and getText to decode and format the text.
- prepareText(ptr, who, real, par1, par2, num):
  - Formats and retrieves text from a given pointer.
  - Skips entries if the pointer is invalid (0xFFFF).
- getText(real):
  - Converts a sequence of binary data into readable text using the letters dictionary.
  - Iterates until it encounters the terminator byte (0xFF), which signals the end of a text block.
- getB(number=1):
  - Helper function to read a specified number of bytes from the file and convert them into integers (little-endian format).
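A helper in the spirit of getB might look like the sketch below. It is not the script's actual code: the name get_b, the explicit stream parameter, and the end-of-data check are assumptions made so the example is self-contained. It also shows the 0xFFFF check that prepareText is described as performing.

```python
import io

INVALID_POINTER = 0xFFFF  # pointer value the article says is skipped


def get_b(stream, number=1):
    """Read `number` bytes and return them as a little-endian unsigned int."""
    raw = stream.read(number)
    if len(raw) != number:
        raise EOFError("unexpected end of data")
    return int.from_bytes(raw, "little")


# In-memory stand-in for an opened DAT file (synthetic bytes, not game data).
stream = io.BytesIO(bytes([0x34, 0x12, 0xFF, 0xFF, 0x07]))
first = get_b(stream, 2)   # two bytes 34 12 read as 0x1234
second = get_b(stream, 2)  # two bytes FF FF read as 0xFFFF
third = get_b(stream)      # default of one byte

# Keep only pointers that are not the invalid marker.
valid = [p for p in (first, second) if p != INVALID_POINTER]
print(hex(first), hex(second), third, valid)
```

With a real file, the same function would work on an object returned by open(path, "rb") after seeking to the TXTD offset.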
Dictionary: letters
The letters dictionary maps hexadecimal values to their corresponding characters or control sequences. Key highlights include:
- Alphabet and Symbols: Maps standard alphanumeric characters (A-Z, a-z, 0-9) and punctuation.
- Special Characters: Supports extended characters such as Ä, ¥, and ….
- Control Codes:
  - {$END}: Signals the end of a text block.
  - {$PAUSE}: Inserts a pause in the text.
  - {$COLOR_F1}: Changes text color (with {$END_COLOR_F0} to revert).
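Because the dictionary maps each byte to a unique character or control token, it can in principle be inverted to turn edited text back into bytes. The sketch below illustrates that idea only; it is not part of the script as described, the name encode_string is hypothetical, and the table contains just the four byte values this page lists.

```python
import re

# Partial table limited to byte values given on this page (assumption:
# the real `letters` dictionary has the same byte -> text shape).
LETTERS = {0x41: "A", 0x42: "B", 0xFA: "\n", 0xFC: "{$PAUSE}"}
TERMINATOR = 0xFF

# Reverse lookup: text token -> byte value.
ENCODE = {text: byte for byte, text in LETTERS.items()}

# A token is either a {$NAME} control code or any single character.
TOKEN = re.compile(r"\{\$[A-Z0-9_]+\}|.", re.DOTALL)


def encode_string(text):
    """Encode text into bytes and append the 0xFF terminator."""
    out = bytearray()
    for token in TOKEN.findall(text):
        if token not in ENCODE:
            raise ValueError("no byte mapping for %r" % token)
        out.append(ENCODE[token])
    out.append(TERMINATOR)
    return bytes(out)


encoded = encode_string("AB{$PAUSE}\nA")
print(encoded.hex(" "))
```

Raising an error on unmapped tokens, rather than silently dropping them, makes gaps in the partial table obvious when experimenting.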
Workflow of the Script
- Initialize:
  - Define the path to the DAT file and the offset of the TXTD data.
  - Load the DAT file in binary mode.
- Read Master Entries:
  - Extract master root and the number of master entries.
  - Use pointers in the master headers to locate the start of each text block.
- Process Entry Headers:
  - For each master entry, extract sub-pointers (entry headers).
  - Use these sub-pointers to locate individual text strings.
- Decode Text:
  - Convert binary data into readable text using the letters dictionary.
  - Handle special formatting codes and ensure proper string termination.
- Output:
  - Structure and output the extracted text for further use or modification.