gitmyhub

sde

Java ★ 50 updated 14y ago

Structured Data Extractor. An application that extracts structured data from web pages. It uses the Data Extraction Based on Partial Tree Alignment (DEPTA) method. (UPDATE: I implemented a newer algorithm: https://github.com/seagatesoft/webdext)
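
For readers unfamiliar with DEPTA: the method aligns the tag trees of data records, and the published description of partial tree alignment relies on a tree similarity measure called Simple Tree Matching. The sketch below is only an illustration of that measure, not code from this repository. The `SimpleTreeMatching` class, the `Node` type, and the example rows are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of Simple Tree Matching; not taken from the sde code base.
public class SimpleTreeMatching {

    // Hypothetical minimal tag-tree node: a tag name plus ordered children.
    static final class Node {
        final String tag;
        final List<Node> children = new ArrayList<>();

        Node(String tag, Node... kids) {
            this.tag = tag;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Returns the number of node pairs in a maximum order-preserving matching
    // between the two trees. Roots with different tags do not match at all.
    static int match(Node a, Node b) {
        if (!a.tag.equals(b.tag)) {
            return 0;
        }
        int m = a.children.size();
        int n = b.children.size();
        // best[i][j] = best matching between the first i children of a
        // and the first j children of b (dynamic programming over child lists).
        int[][] best = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                int paired = best[i - 1][j - 1]
                        + match(a.children.get(i - 1), b.children.get(j - 1));
                best[i][j] = Math.max(Math.max(best[i - 1][j], best[i][j - 1]), paired);
            }
        }
        return best[m][n] + 1; // +1 for the matched roots
    }

    public static void main(String[] args) {
        // Two made-up table rows standing in for data records on a page.
        Node row1 = new Node("tr",
                new Node("td", new Node("img")),
                new Node("td", new Node("a")),
                new Node("td"));
        Node row2 = new Node("tr",
                new Node("td", new Node("a")),
                new Node("td"));
        System.out.println(match(row1, row2));
    }
}
```

A higher score means the two records share more of their structure. In partial tree alignment, scores like this are used to decide which nodes of one record line up with which nodes of another, so that matching fields end up in the same column of the extracted table.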
