Damir Cavar is a scientist in Natural Language Processing and AI, and in Classical and Quantum Computing.
This is the course page for Topics in Artificial Intelligence / Seminar on Knowledge Graphs, Large Language Models, and Graph-based Reasoning using Agentive AI models by Damir Cavar.
Sections: 10343 or 10496
Instructor: Assoc. Prof. Dr. Damir Cavar
Contact: email, phone
Office hours: Thursday 1-2 PM and by arrangement
Office: Ballantine Hall (BH) 511
Meeting time: Tuesday and Thursday, 2:20 - 3:35 PM
Course website: Assignments, slides, and other material will be posted on Canvas.
Credits: 3
Large Language Models (LLMs) such as ChatGPT, Gemini, or Claude 4 demonstrate a high level of sophistication in the language and image processing domain. One of the problematic issues with such models is hallucination: LLMs tend to generate plausible and well-formulated text, references, and data that do not correspond to factual knowledge. Various proposals discuss how approved and valid knowledge can be added to LLMs to minimize hallucinations and to update their knowledge without frequently retraining them on the newest content. LLMs seem to provide superior capabilities to process or generate natural language and, to a limited extent, to reason over utterances and claims. They also seem able to reason over temporal and event logic in limited ways. We want to combine these capabilities with formal knowledge representations.
Knowledge Graphs, Ontologies, and related representations of semantic properties and concept relations facilitate efficient graph-based storage for processing meaning via entailment and concept hierarchies. These technologies enable the specification of semantic relations and constraints for a precise representation of core aspects of natural language semantics, including some possibilities to reason over data and relations. Factual knowledge stored in Knowledge Graphs enables processing descriptions of the world or of a specific knowledge domain. Knowledge Graphs are, however, limited in their ability to represent complex events and changes of entities and situations over time.
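As a minimal illustration of entailment over a concept hierarchy, the following Python sketch stores a toy Knowledge Graph as a set of subject-predicate-object triples and infers inherited types by following subclass links. The entity names (Rex, Dog, Mammal, Animal) and the relation labels "type" and "subClassOf" are invented for this example; they are not taken from any course dataset or specific ontology.

```python
# Toy knowledge graph: a set of (subject, predicate, object) triples.
# All entity and relation names are hypothetical, chosen for illustration.
TRIPLES = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("Rex", "type", "Dog"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via subClassOf edges."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def entailed_types(entity, triples):
    """Return the asserted types of entity plus all inherited superclasses."""
    asserted = {o for s, p, o in triples if s == entity and p == "type"}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(entailed_types("Rex", TRIPLES)))  # ['Animal', 'Dog', 'Mammal']
```

Only the fact that Rex is a Dog is stored explicitly; that Rex is also a Mammal and an Animal follows from the hierarchy. This is the kind of reasoning that a graph representation supports without storing every fact.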
This seminar consists of a series of experiments to test and experiment with:
We will look at implementations of LLMs and experiment with integrating Knowledge Graphs in such LLMs. In addition to that, we will experiment with approaches to generate knowledge representations from structured and unstructured sources, providing access to such models via LLMs.
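One simple way to picture such an integration is to retrieve matching facts from a graph and place them in the prompt before the question. The Python sketch below assumes a naive string-matching retriever and a hypothetical prompt template; it does not call any LLM, and the function names retrieve_facts and build_prompt are illustrative, not part of any library.

```python
# A tiny fact store; in practice this would be a query against a Knowledge Graph.
FACTS = [
    ("Marie Curie", "born in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Warsaw", "capital of", "Poland"),
]


def retrieve_facts(question, facts):
    """Return triples whose subject or object is mentioned in the question.

    Naive substring matching stands in for real entity linking.
    """
    q = question.lower()
    return [t for t in facts if t[0].lower() in q or t[2].lower() in q]


def build_prompt(question, facts):
    """Build a prompt that asks a model to answer from the retrieved facts only."""
    lines = [f"- {s} {p} {o}." for s, p, o in retrieve_facts(question, facts)]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return (
        "Answer using only the facts below. "
        "If they are insufficient, say so.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )


print(build_prompt("Where was Marie Curie born?", FACTS))
```

The resulting string would be passed to whichever model is under study. Constraining the answer to retrieved facts is one of several proposed ways to reduce hallucination, not the only one.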
We discuss, implement, and experiment with general techniques to map knowledge from unstructured sources (text, speech, image, sensory data) to graph representations:
We use graphs as symbolic knowledge representations (or Knowledge Graphs) with RDF, JSON-LD, or OWL backends, as well as probabilistic and dynamic networks in hybrid models (symbolic and neural). The complexity of knowledge extraction increases considerably when it includes processing implicatures and presuppositions and representing them in graph models.
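To make the JSON-LD representation concrete, the following Python sketch collects the triples about one subject into a JSON-LD style node using only the standard library. The example.org identifiers, the vocabulary URL, and the helper name triples_to_jsonld are assumptions made for illustration. Treating every value that starts with "http" as a node reference is a deliberate simplification.

```python
import json


def triples_to_jsonld(subject, triples, vocab="http://example.org/vocab#"):
    """Collect all triples about one subject into a JSON-LD style dict."""
    node = {"@context": {"@vocab": vocab}, "@id": subject}
    for s, p, o in triples:
        if s != subject:
            continue
        # The "type" relation maps to the JSON-LD keyword @type.
        key = "@type" if p == "type" else p
        # Simplification: values that look like IRIs become node references.
        if key != "@type" and o.startswith("http"):
            value = {"@id": o}
        else:
            value = o
        node.setdefault(key, []).append(value)
    return node


# Hypothetical triples; the example.org IRIs do not refer to real resources.
triples = [
    ("http://example.org/Rex", "type", "Dog"),
    ("http://example.org/Rex", "name", "Rex"),
    ("http://example.org/Rex", "owner", "http://example.org/Ada"),
]

doc = triples_to_jsonld("http://example.org/Rex", triples)
print(json.dumps(doc, indent=2))
```

The @context, @id, and @type keys are JSON-LD keywords. Everything else in the document is ordinary JSON, which is why this format is convenient for exchanging graph data with web services and LLM tooling.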
Our goal is to a) gain a deep understanding of the mapping from unstructured information (e.g., language, vision) to high-precision graph-based knowledge representations, b) generate implicatures and presuppositions from both in order to extend logical reasoning capabilities, and c) explore the limits of hybrid AI and Machine Learning methods on symbolic and probabilistic/dynamic Knowledge Graphs using various approaches to Graph Embeddings, with different graph and Graph Neural Network algorithms. Integrating sophisticated knowledge representations in an LLM environment can significantly benefit AI systems and provide new, reliable reasoning capabilities for data and information in various domains, e.g., medical, cybersecurity, or scientific writing.
Crucial aspects of course outcomes are:
Understand machine learning, computational semantics, and Natural Language Processing (NLP)
Understand symbolic knowledge representations and computational reasoning
Understand the linguistic annotations, analyses, and outputs that LLMs and ontologies generate
Acquire the skills to develop your own models and to tune such methods in order to apply NLP to entirely new problems and research areas
Understand how Large Language Models and Generative AI work
Understand orchestration of LLMs and Agentive AI architectures with integrated RAGs and Knowledge Representations
Reinforce concepts of programming in Python
Learn to apply well-documented scientific libraries in Python
This course provides an essential platform for further work with LLMs, NLP, and AI.
Grades are based on the following schema:
Note to students: these dates are subject to change with adequate notification via Canvas announcements.
Written reports: due most Fridays
Final project oral report: Tuesday, Dec. 09 and Thursday, Dec. 11 (in lecture)
Final project - written report: Thursday, Dec. 18
There is no required textbook in this seminar.
A recommended textbook is available online free of charge:
Additional books and publications recommended for course reference materials are supplied in class.
No software purchases are required; the course uses free, open-source software (Python and GitHub).
Students are expected to supply a personal working laptop computer with permissions to install and use the course software.
Barrasa, Jesus and Jim Webber (2023) Building Knowledge Graphs. O’Reilly Media, Inc.
Jakus, Grega, and Veljko Milutinović, Sanida Omerović, Sašo Tomažič (2013) Concepts, Ontologies, and Knowledge Representation. Springer New York, NY.
Staab, Steffen, and Rudi Studer (2009) Handbook on Ontologies. Springer Berlin, Heidelberg.
Keet, C. Maria (2020) An Introduction to Ontology Engineering.
Wolfram, Steven (2023) What Is ChatGPT Doing … and Why Does It Work?
Pan, Shirui and Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, Xindong Wu (2023) Unifying Large Language Models and Knowledge Graphs: A Roadmap.
Radford, Alec, and Karthik Narasimhan, Tim Salimans, Ilya Sutskever (2018) Improving Language Understanding by Generative Pre-Training.
Floridi, Luciano and Massimo Chiriatti (2020) GPT-3: Its Nature, Scope, Limits, and Consequences. In Minds and Machines, 30:681–694.
Thoppilan, R. et al. (2022) LaMDA: Language Models for Dialog Applications.
Neo4j tutorial.
This syllabus is subject to change and likely will change. All important changes will be made in writing, with ample time for adjustment.
(C) 2025 by Damir Cavar