Latest revision as of 15:39, 10 January 2025

Tomba 2 Asset Extractor

Overview

This script extracts and organizes assets from the PlayStation game Tomba! 2: The Evil Swine Return. It processes the game's .IDX, .DAT, and .IMG files to extract and categorize data such as sprites, 3D asset models, level geometry, text, animations (types 1, 2, and 3), level collision, drawmaps, and backgrounds. This is particularly useful for modding, asset recovery, or archival purposes.

The script is written in Python and uses the standard struct module to interpret binary data formats. Users set the input (CDpath) and output (outfolder) directories by editing the script before running it.

Prerequisites

Python 3.6 or later installed on your system (the script uses f-strings).
A copy of the game's .IDX, .DAT, and .IMG files.
Basic understanding of binary file structures and file system operations.

Usage

Configure Input and Output Paths

Edit the following lines in the script to match the locations of your input files and desired output directory:

CDpath = "/path/to/TOMBA2/CD"  # Location of the .IDX, .DAT, and .IMG files 
outfolder = "/path/to/outputfolder"  # Destination for extracted files
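Before running the extractor, it can help to confirm that the configured paths actually contain the expected files. The following is a minimal sketch, not part of Tomba2Ex.py: the helper name check_paths is hypothetical, and the file names TOMBA2.IDX, TOMBA2.DAT, and TOMBA2.IMG (with that exact capitalization) are assumed.

```python
from pathlib import Path

def check_paths(cd_path, out_folder, stem="TOMBA2"):
    """Return the three input paths; raise if any is missing.

    Hypothetical helper: the file stem and extensions are assumptions.
    """
    cd = Path(cd_path)
    inputs = {ext: cd / f"{stem}.{ext}" for ext in ("IDX", "DAT", "IMG")}
    missing = [str(p) for p in inputs.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing input files: " + ", ".join(missing))
    # Create the output folder (and any parents) if it does not exist yet.
    Path(out_folder).mkdir(parents=True, exist_ok=True)
    return inputs
```

Calling check_paths(CDpath, outfolder) once at the top of the script would fail early with a readable message instead of partway through extraction.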

Execute the Script

Run the script using Python:

python Tomba2Ex.py

The script will read the .IDX file to determine asset locations and extract data from the .DAT and .IMG files into organized subdirectories within the outfolder.

How It Works

Creating Directories

The script creates a structured output directory with the following organization:

outputfolder/
├── chunk_00/
│   ├── 00_sdats/
│   │   ├── 0000-1234.sdat
│   │   └── 00_pointers.txt
│   ├── 00_vrams/
│   │   ├── 0000-1234.cvram
│   │   ├── 00_shards/
│   │   │   ├── 00-0.shard
│   │   │   └── ...
│   │   └── 00.vram
│   └── 00_trail/
│       ├── 1234-5678.bin
│       └── ...
├── chunk_01/
└── ...
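The layout above could be created per chunk with something like the sketch below. The function name make_chunk_dirs is hypothetical, and the assumption that directory names use the two-digit uppercase hexadecimal chunk index is inferred from the tree and from the script's {chunk_index:02X} formatting.

```python
import os

def make_chunk_dirs(outfolder, chunk_index):
    """Create the sdats, vrams, shards and trail folders for one chunk."""
    tag = f"{chunk_index:02X}"  # assumed naming: two hex digits
    chunk_dir = os.path.join(outfolder, f"chunk_{tag}")
    vram_dir = os.path.join(chunk_dir, f"{tag}_vrams")
    paths = {
        "sdats": os.path.join(chunk_dir, f"{tag}_sdats"),
        "vrams": vram_dir,
        "shards": os.path.join(vram_dir, f"{tag}_shards"),
        "trail": os.path.join(chunk_dir, f"{tag}_trail"),
    }
    for path in paths.values():
        # exist_ok lets the extractor be re-run over an existing output tree.
        os.makedirs(path, exist_ok=True)
    return paths
```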

Data Decoding with Tuplify

The tuplify function splits a 32-bit integer into two components:

dat_id: The upper 8 bits (bits 24 to 31).
dat_ptr: The lower 24 bits (bits 0 to 23).

def tuplify(item):
    """Split a packed 32-bit value into (dat_id, dat_ptr)."""
    dat_id = item >> 24           # upper 8 bits
    dat_ptr = item & 0x00FFFFFF   # lower 24 bits
    return (dat_id, dat_ptr)

The script uses these tuples when interpreting asset pointers, such as the SDAT pointers covered under Handling Special Data Types.
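As a worked example, the snippet below unpacks two packed values from raw bytes with struct and passes them through tuplify. The byte string is made up for illustration, and little-endian byte order is an assumption (it matches the PlayStation CPU, but the page does not state the file's byte order).

```python
import struct

def tuplify(item):
    dat_id = item >> 24
    dat_ptr = item & 0x00FFFFFF
    return (dat_id, dat_ptr)

# Two illustrative 32-bit values stored little-endian (assumed byte order):
# 0x12345678 and 0x03000010.
raw = bytes.fromhex("78563412" "10000003")
values = struct.unpack("<2I", raw)
pointers = [tuplify(v) for v in values]

for dat_id, dat_ptr in pointers:
    print(f"id={dat_id:02X} ptr={dat_ptr:06X}")
# prints:
# id=12 ptr=345678
# id=03 ptr=000010
```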

Reading Chunks

The .IDX file is divided into fixed-size chunks. For each chunk, the script extracts metadata, pointers, and actual data:

for chunk_index in range(os.path.getsize(idxpath) // chunk_size):
    print(f"Reading Chunk index {chunk_index:02X}...")
    ...
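One way to picture the loop above is as a generator that yields each fixed-size chunk in turn. This is only a sketch: iter_chunks is a hypothetical name, and the real value of chunk_size is defined in the script and not shown on this page.

```python
import os

def iter_chunks(idxpath, chunk_size):
    """Yield (chunk_index, chunk_bytes) for every complete chunk in the file."""
    # Integer division ignores any partial chunk at the end of the file.
    total = os.path.getsize(idxpath) // chunk_size
    with open(idxpath, "rb") as idx:
        for chunk_index in range(total):
            yield chunk_index, idx.read(chunk_size)
```

A caller would then write for chunk_index, chunk in iter_chunks(idxpath, chunk_size): and parse each chunk from the returned bytes.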

Chunk Metadata

Chunk metadata includes:

img_start and img_end: Texture data range in .IMG.
dat_start and dat_end: Asset data range in .DAT.
pointer_amount: Number of pointers for this chunk.
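The fields listed above could be decoded from the start of a chunk with struct, roughly as sketched here. The layout is an assumption made purely for illustration (five consecutive little-endian unsigned 32-bit values, in the order listed); the actual field order, widths, and offsets used by Tomba2Ex.py may differ.

```python
import struct
from collections import namedtuple

ChunkHeader = namedtuple(
    "ChunkHeader",
    "img_start img_end dat_start dat_end pointer_amount",
)

# Hypothetical layout: five little-endian uint32 values at offset 0.
HEADER_FORMAT = "<5I"

def parse_header(chunk):
    """Decode the assumed header fields from the first bytes of a chunk."""
    fields = struct.unpack_from(HEADER_FORMAT, chunk, 0)
    return ChunkHeader(*fields)
```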

Reading Asset Data

Data is read from .IMG and .DAT using the ranges defined in the chunk metadata.
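Reading a start/end range from a file comes down to a seek followed by a read. The sketch below uses a hypothetical helper, read_range, and treats the end offset as exclusive. Whether the ranges in the chunk metadata are byte offsets or sector counts is not stated here, so the unit parameter is an assumption that lets either be tried (for example unit=0x800 for 2048-byte CD sectors).

```python
def read_range(path, start, end, unit=1):
    """Return the bytes in [start, end) of a file, scaled by unit."""
    if end < start:
        raise ValueError(f"end ({end}) is before start ({start})")
    with open(path, "rb") as f:
        f.seek(start * unit)
        return f.read((end - start) * unit)
```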

Handling Special Data Types

SDAT (Data) Pointers

SDAT data includes pointer structures interpreted as tuples using the tuplify function. The extracted pointers are saved in a text file.

out_sdat_info.write(f"ID: {sdat_pointers[i][0]:02X} | Pointer: {sdat_pointers[i][1]:04X}\n")
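To show what the resulting text file looks like, the sketch below writes a small made-up pointer list using the same format string. The function name write_pointer_table and the sample values are illustrative only.

```python
import io

def write_pointer_table(sdat_pointers, out):
    """Write one 'ID | Pointer' line per (dat_id, dat_ptr) tuple."""
    for dat_id, dat_ptr in sdat_pointers:
        out.write(f"ID: {dat_id:02X} | Pointer: {dat_ptr:04X}\n")

# Illustrative values, not taken from the game data.
buffer = io.StringIO()
write_pointer_table([(0x12, 0x345678), (0x03, 0x10)], buffer)
print(buffer.getvalue(), end="")
# prints:
# ID: 12 | Pointer: 345678
# ID: 03 | Pointer: 0010
```

Note that :04X sets a minimum width of four hex digits, so larger 24-bit pointers are printed in full rather than truncated.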

VRAM and Texture Shards

The .IMG file contains VRAM data structured into texture "shards." Each shard corresponds to a texture fragment with its own metadata:

x, y: Coordinates within VRAM.
w, h: Dimensions of the texture.

Shards are recombined to form complete VRAM pages:

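with open(imgdest + f"/{chunk_index:02X}.vram", "w+b") as vram:
    # Pre-size the file to 0x100000 bytes (1 MiB, the size of PlayStation VRAM)
    # by writing a single zero byte at the last offset.
    vram.seek(0x100000 - 1)
    vram.write(b"\0")
    ...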
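Placing a shard into the VRAM image can be sketched as a row-by-row copy into a 1 MiB buffer. PlayStation VRAM is 1024 x 512 pixels at 16 bits each, which gives the 0x100000 bytes used above. The helper blit_shard is hypothetical, and it assumes that x, y, w, and h are expressed in 16-bit VRAM pixels and that the shard's pixel data is stored row by row; the script's actual shard format may differ.

```python
VRAM_WIDTH = 1024      # pixels per row
VRAM_HEIGHT = 512      # rows
BYTES_PER_PIXEL = 2    # 16 bits per VRAM pixel

def blit_shard(vram, x, y, w, h, pixels):
    """Copy a w x h block of 16-bit pixels into vram at (x, y)."""
    row_bytes = w * BYTES_PER_PIXEL
    if len(pixels) != row_bytes * h:
        raise ValueError("shard payload size does not match w * h")
    if x + w > VRAM_WIDTH or y + h > VRAM_HEIGHT:
        raise ValueError("shard does not fit inside VRAM")
    for row in range(h):
        dst = ((y + row) * VRAM_WIDTH + x) * BYTES_PER_PIXEL
        src = row * row_bytes
        vram[dst:dst + row_bytes] = pixels[src:src + row_bytes]

# An empty VRAM image: 1024 * 512 * 2 = 0x100000 bytes.
vram_image = bytearray(VRAM_WIDTH * VRAM_HEIGHT * BYTES_PER_PIXEL)
```

Working on a bytearray and writing it to disk once avoids a seek per row on the output file.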

Trail Data

Trail data is the "trailer" section at the end of each chunk, containing additional asset ranges. The script identifies these ranges and extracts the corresponding data.

for t in range(0, len(traildata), 2):
    dat_trail_start, dat_trail_end = traildata[t], traildata[t+1]
    ...
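The pairing logic in this loop can be illustrated in isolation. In the sketch below, split_trail is a hypothetical helper; it assumes the trail values are start/end offsets into the chunk's already-loaded DAT bytes, with the end exclusive, and that output files are named with four-digit hexadecimal offsets as the directory tree suggests.

```python
def split_trail(traildata, dat_bytes):
    """Return (filename, data) pairs for consecutive start/end offsets."""
    if len(traildata) % 2:
        raise ValueError("trail data must contain start/end pairs")
    pieces = []
    for t in range(0, len(traildata), 2):
        start, end = traildata[t], traildata[t + 1]
        name = f"{start:04X}-{end:04X}.bin"  # assumed naming scheme
        pieces.append((name, dat_bytes[start:end]))
    return pieces
```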

Link to script: Tomba2Ex.py (https://drive.google.com/file/d/1-Rr2GQL00wug4x4ABgSIZWPGqTUwhyys/view?usp=drive_link)
