
Efficient Information Extraction using LLM 

Learn how to extend LLMs’ capabilities to enable efficient information retrieval from PDF documents

  • Schedule

    28 – 30 May 2024

    18.30 – 21.30 (WIB)

  • Online-Interactive Learning

    Via Zoom

  • Investment

    Rp. 1.500.000

Course Summary

PDF reports are used in many sectors, and they are often very detailed and comprehensive. Because they need to cover every relevant aspect of a subject and present data and analysis in full so that readers gain a complete understanding, these reports tend to be long and complex, which makes it challenging to find specific information.

With the development of artificial intelligence (AI) and machine learning (ML) technologies, we can use large language models (LLMs) to streamline the search for information in PDF reports that run to tens or even hundreds of pages. In this 3-day workshop, you will learn how to extend LLMs’ capabilities to enable efficient information retrieval from PDF documents. Starting with plain PDF documents, you will learn how to process them and serve them as additional context for LLMs so that, eventually, the LLM can answer your questions about the documents!

NOTE: This workshop will be delivered in Bahasa Indonesia.

Learning Outcomes

Upon completion of this workshop, you will be able to:

  • Master the underlying basics of LLMs.
  • Implement LLMs for real-world applications using LangChain.
  • Understand the workflow of providing additional context for LLMs.
  • Develop skills in advanced document-handling techniques for information retrieval using LLMs.

Syllabus

  • Introduction to Python for data science
  • Working with Python environment
  • Python fundamental data types and data structures
  • Understanding the looping concept in Python
  • Understanding how to create Python functions
  • Understanding the usage of Python libraries
  • The concept of generative AI 
  • LLM as generative AI 
  • Transformer architecture in a nutshell
  • LLM capabilities, limitations, and considerations
  • The big picture of LangChain concepts and components
  • API concepts and settings for LangChain usage
  • Demonstration of LLM usage with LangChain
  • The concept of RAG (retrieval augmented generation)
  • Loading PDF documents using LangChain
  • The concept of embedding for PDF documents
  • Storing the embedding using a vector database
  • Prompt creation for Q&A and summarization cases
  • Employing LLM for information retrieval
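
The retrieval workflow named in the syllabus (load a document, embed it, store the embeddings, retrieve relevant passages, and build a prompt) can be illustrated with a minimal sketch. This is not the workshop's material and it does not use LangChain: it relies only on the Python standard library, replaces a real embedding model with a toy bag-of-words vector, and uses a plain list in place of a vector database. The sample text and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text):
    """Split text into sentence-sized chunks (a stand-in for PDF page/paragraph splitting)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, store, top_k=1):
    """Return the top_k chunks whose embeddings are most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Assemble the prompt that would be sent to an LLM (no LLM is called here)."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical text standing in for content extracted from a PDF report.
document = (
    "Revenue grew strongly in the third quarter. "
    "The company opened two new offices in Surabaya. "
    "Employee headcount remained stable during the year."
)

chunks = split_into_chunks(document)
# A plain list of (chunk, embedding) pairs plays the role of the vector database.
store = [(chunk, embed(chunk)) for chunk in chunks]

question = "How many new offices were opened?"
top_chunks = retrieve(question, store, top_k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real setup, the prompt would be passed to an LLM, and the chunking, embedding, and storage steps would be handled by dedicated tools rather than these simplified helpers.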

STUDENT TESTIMONIALS

This testimonial video was recorded after our previous Online Data Science Series: Time Series Analysis for Business Forecasting.

LEARN FROM ANYWHERE

Our learning format is online and interactive, so the experience feels as if you were present in a physical classroom. You can access the class using your Zoom account on pre-defined dates.

  • LEARN AT YOUR OWN PACE

    Zoom recordings, course books (PDF & HTML files), practice datasets, reference notes, and working files are accessible through your Learning Management System account.

  • PROVE YOUR MASTERY

    Show current and prospective employers your mastery in computer vision with a signed certificate of completion.

  • CONNECT WITH LIKE-MINDED PEOPLE

    Be a part of our data-passionate community with 5000+ members and 1000+ alumni.

FOR ABSOLUTE BEGINNERS

Workshops in this series are tailored to casual programmers and non-programmers who are taking their first steps into data science. They assume no prior knowledge or academic background, and attendees will be introduced to the beautiful art of writing R / Python code to produce data visualizations and build machine learning models. Each workshop has a gentle learning curve designed with non-technical professionals and academics in mind.

Yes, you can still attend the workshop as it is a beginner-friendly workshop.

Our system will send you an email containing a link and details to join a Google Classroom.

Online learning will be conducted via Zoom.us. The link to join the Zoom class will be announced via Google Classroom.

Learning materials can be obtained via Google Classroom.

Yes, you will receive a certificate of completion.

YOUR INSTRUCTOR

Saskia Dwi Ulfah

Saskia Dwi Ulfah, a Data Science Instructor at Algoritma Data Science School, has nearly two years of experience in data science and analytics, with a strong proficiency in Python, SQL, and R. Her expertise extends to deep learning and managing end-to-end machine learning projects. She has made significant contributions to collaborative projects in Omdena’s open data science initiatives and developed an advanced face anti-spoofing algorithm, enhancing the security of face recognition systems.

Saskia’s diverse skills are further showcased in her high-quality corporate training for clients like XL Axiata. Her commitment and extensive knowledge in data science not only mark her as an invaluable asset but also ensure she provides an enriching and comprehensive learning experience.