California Law: Recovering Meaning and Metadata with RegEx

In a previous post, I mentioned some of the challenges in recovering meaningful structural information (titles, paragraphs) from PDFs, and why government entities should retain this information when they publish electronic documents. I'll have more to say in future posts about what information is important to retain, which, at a minimum, should include document structure (titles, sections,...