Section outline

    • Text indexing and IR fundamentals

    • Regex and BeautifulSoup for HTML extraction

    • Textract for PDFs and Word documents

    • spaCy setup and Jupyter usage

    • NumPy and pandas for data handling