Novel AI algorithm for efficient domain-specific language adaptation - The ESG case

As project leader on a research paper, I developed Pool-based Adapter Active Learning (P-BAAL), a novel algorithm for the efficient domain adaptation of Large Language Models (LLMs) in resource-constrained scenarios. The resulting model proved an effective method for classifying textual data by its adherence to ESG goals, allowing rapid analysis of such documents by financial firms, law firms, governments and other potential stakeholders.

Key Deliverables

  1. Ready-to-use model for the rapid and efficient verification of adherence to ESG goals within textual data of any format.
  2. Complete algorithm as a GitHub repository, trainable on any text classification dataset.
  3. Publicly available code base for further research and improvements to the original algorithm.

3%

Improvement in classification accuracy over the current state of the art in domain-specific modeling

92%

Of documents correctly classified as adhering or not adhering to any of the ESG goals.

6,000+

Documents classified, creating a unique ESG textual dataset usable for a variety of tasks key to further research on AI in the sustainability landscape

Key Phases

  1. Dataset collection, classification of textual data according to adherence to ESG goals, and analysis of the complete dataset for quality verification.
  2. Complete algorithm design, mathematical definition and architectural development of the model.
  3. Code development for the algorithm architecture, training and evaluation, as well as testing of different hyperparameters for model selection.
  4. Setup of the final model, usable for classifying textual data by adherence to ESG goals.
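
This summary does not spell out how P-BAAL selects samples or trains its adapters, so the following is only a generic sketch of the pool-based active learning idea the name refers to, not the published algorithm. It assumes uncertainty sampling as the selection rule and uses a tiny logistic regression as a stand-in for an adapter-tuned LLM. All function names and parameters (`active_learning`, `oracle`, `seed_size`, `batch_size`, `rounds`) are hypothetical.

```python
import math
import random


def predict_proba(w, b, x):
    """Probability of the positive class under a logistic model."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit a small logistic regression by SGD.

    Assumption: this stands in for fine-tuning adapter weights on the
    currently labeled set; it is not the model used in the project.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def uncertainty(p):
    """1.0 when the model is maximally unsure (p = 0.5), 0.0 when certain."""
    return 1.0 - 2.0 * abs(p - 0.5)


def active_learning(pool, oracle, seed_size=4, batch_size=2, rounds=5, rng=None):
    """Generic pool-based loop: train, score the pool, label the most uncertain."""
    rng = rng or random.Random(0)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)

    # Start from a small random seed set labeled by the oracle (the annotator).
    labeled = [unlabeled.pop() for _ in range(seed_size)]
    labels = [oracle(x) for x in labeled]

    for _ in range(rounds):
        if not unlabeled:
            break
        w, b = train_logreg(labeled, labels)
        # Rank remaining pool items from most to least uncertain.
        unlabeled.sort(key=lambda x: uncertainty(predict_proba(w, b, x)), reverse=True)
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled.extend(batch)
        labels.extend(oracle(x) for x in batch)

    # Final fit on everything labeled so far.
    w, b = train_logreg(labeled, labels)
    return w, b, labeled, labels
```

In a setting like the one described here, `pool` would hold document representations and `oracle` would be the human annotator deciding ESG adherence. The point of the loop is that annotation effort goes to the documents the current model is least sure about, which is what makes the approach attractive when labeled data is scarce.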