Topic Links 2.2 Archive Fix May 2026

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^t-([0-9]+)(\.html)?$ index.php?t=$1 [L,NC,QSA]

We didn't have the source code for the original 2.2 parser. But we had 12,000 archived HTML files and a SQL dump of the original topic map. Here is the fix we built.

First, we needed to brute-force what the 2.2 strings actually meant. By cross-referencing the SQL dump (which had TopicID and LegacyPointer columns), we built a lookup table.
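A minimal sketch of the lookup-table step, assuming the relevant table from the dump has been exported to CSV with TopicID and LegacyPointer columns. The CSV export, the function name build_lookup, and the sample rows are illustrative assumptions, not our original tooling.

```python
import csv
import io


def build_lookup(csv_text):
    """Map each LegacyPointer string to its integer TopicID."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        pointer = row["LegacyPointer"].strip()
        if pointer:  # skip rows that have no legacy pointer
            lookup[pointer] = int(row["TopicID"])
    return lookup


# Hypothetical sample rows, for illustration only.
sample = "TopicID,LegacyPointer\n101,2.2.7.101.4821\n102,2.2.7.102.9930\n"
table = build_lookup(sample)
```

With the pointers as dictionary keys, resolving a legacy link is a single dictionary lookup, which is what makes patterns in the pointer strings easy to spot.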

We discovered the 2.2 format is actually a disguised tuple: 2.2.[CategoryID].[TopicNumber].[Checksum]

We wrote a Python regex to extract the TopicNumber:

import re
pattern = r"2\.2\.\d+\.(\d+)\.\d+"
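As a rough illustration of how that pattern might be applied, the sketch below pulls the TopicNumber out of a string containing a 2.2 pointer. The helper name extract_topic_number and the sample input are assumptions for the example.

```python
import re

# Same pattern as above: 2.2.[CategoryID].[TopicNumber].[Checksum]
LEGACY_POINTER = re.compile(r"2\.2\.\d+\.(\d+)\.\d+")


def extract_topic_number(text):
    """Return the TopicNumber from the first 2.2 pointer in text, or None."""
    match = LEGACY_POINTER.search(text)
    return int(match.group(1)) if match else None


# Hypothetical anchor markup, for illustration only.
topic = extract_topic_number('<a href="2.2.7.101.4821">Invoice details</a>')
```

Returning None instead of raising keeps the corrupt pointers easy to collect for a separate pass.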

Because we couldn't rewrite 12,000 HTML files by hand, and didn't want to break existing deep links from Google, we left the files alone and deployed a lightweight rewrite layer instead: a Cloudflare Worker / Apache .htaccess rewrite rule.

The logic:

For the 8% of links where the TopicNumber was corrupt, we implemented a fuzzy title search. We took the original anchor text (e.g., "Click here for invoice details") and ran a Levenshtein distance match against all archived page titles.
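The fuzzy fallback can be sketched roughly as follows. This is a plain dynamic-programming Levenshtein distance with a linear scan over the titles, not necessarily the exact code we ran. The function names, the case-folding step, and the sample titles are assumptions.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def best_title_match(anchor_text, titles):
    """Return (title, distance) for the title closest to anchor_text.

    Raises ValueError if titles is empty.
    """
    needle = anchor_text.casefold()
    return min(
        ((title, levenshtein(needle, title.casefold())) for title in titles),
        key=lambda pair: pair[1],
    )


# Hypothetical titles, for illustration only.
match = best_title_match("Invoice detials", ["Invoice details", "Shipping rates"])
```

Every corrupt link is compared against every archived title, so the cost grows with the number of links times the number of titles. In practice a maximum acceptable distance is worth adding so that generic anchor text does not get matched to an unrelated page.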

It's slow. It's ugly. But it works.

Run SQL queries to fix internal links stored in post or thread tables. Always back up your database first.

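UPDATE post SET pagetext = REPLACE(pagetext,
    '/archive/index.php/t-',
    '/archive/index.php/t-'
);
-- This is a simplified example: the search and replacement strings are shown identical,
-- so substitute your own old and new URL prefixes. A regex might be needed for anything
-- beyond a fixed prefix.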

For MySQL 8.0+, use REGEXP_REPLACE:

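-- MySQL 8.0 uses ICU regular expressions: backreferences in the replacement are
-- written $1, and a literal dot in the pattern needs a doubled backslash.
UPDATE post SET pagetext = REGEXP_REPLACE(pagetext,
    'archive/index\\.php/t-([0-9]+)\\.html\\.html',
    'archive/index.php/t-$1.html',
    1, 0, 'i');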
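Before running an UPDATE like this against the live table, it can help to dry-run the same substitution on a few sample strings. The sketch below mirrors the doubled-.html fix in Python. The function name fix_double_html and the sample text are assumptions, and Python's re module is not identical to MySQL's regex engine, so treat it as a sanity check rather than a guarantee.

```python
import re

# Matches archive links that ended up with a doubled ".html" suffix.
DOUBLE_HTML = re.compile(
    r"archive/index\.php/t-([0-9]+)\.html\.html",
    re.IGNORECASE,
)


def fix_double_html(text):
    """Collapse t-N.html.html to t-N.html, leaving other text untouched."""
    return DOUBLE_HTML.sub(r"archive/index.php/t-\g<1>.html", text)


# Hypothetical post body, for illustration only.
fixed = fix_double_html("See /archive/index.php/t-42.html.html for details")
```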
