\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{hyperref}
\usepackage{natbib}

\setlength{\parindent}{0pt}
\setlength{\parskip}{0.5\baselineskip}

\title{\bf Graph-Structure Discovery in a Logical Probabilistic Model}
\author{
  Greg Coppola \\
  {\em coppola.ai}
}
\date{\today}

\begin{document}

\maketitle

\section{Abstract}

Two primary problems with large language models are that they ``hallucinate'' and that they ``do not reason''.
We have previously proposed \citep{coppola2024__theory_experiments} that both of these problems can be addressed by making greater use of ``discrete, probabilistic methods'' in general, and of what we have called a {\em Logical Bayesian Network} or a {\em Generative Logical Model} in particular.

A {\em Logical Bayesian Network} shares with other {\em Bayesian Networks} the property that the {\em structure} of the network varies greatly between domains and must be ``discovered'' as part of the learning process.
This gives another perspective on why the large language model has been successful.
A neural network can be viewed as a kind of (perhaps non-probabilistic) graphical model whose ``structure'' is overly connected at first, with certain connections differentiably ``learned to be ignored'' during training.
Historically, this must have been an easier problem to solve than the non-differentiable, discrete structural growth required for a more traditional Bayesian Network to scale to an ``open domain''.
However, the benefits of a discrete logical generative model that we have already argued for remain worth pursuing.

Thus, we survey existing strategies and develop novel strategies for {\em growing} a discrete Bayesian Network to fit the data.
We investigate strategies based on maximization of likelihood, including expectation-maximization techniques and the use of {\em hidden variables}.
We investigate the ``transfer'' of knowledge either from the large language model or from online information using {\em retrieval-augmented generation}.
We also explore interpolations of these two strategies, which we might call {\em hallucinate and test}: allowing the LLM to ``dream up'' new discrete connections, but then testing these against the data to see whether the model has really ``improved''.
Finally, we remark on ways in which structure discovery in a discrete logical model can mirror what is often referred to as ``creativity''.

\section{Dynamic Online Project}

The dynamically updated content for this project (documents, code, and diagrams) can be found in the {\em GitHub} repository \citep{coppola2025__discover_structure}.
We intend to package any applicable results as focused contributions to refereed papers.

\begin{thebibliography}{99}

\bibitem[Coppola(2024)]{coppola2024__theory_experiments}
Coppola, G. (2024).
\newblock The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model.
\newblock arXiv preprint arXiv:2402.06557,
\newblock \url{https://arxiv.org/abs/2402.06557}.

\bibitem[Coppola(2025)]{coppola2025__discover_structure}
Coppola, G. (2025).
\newblock Graph-Structure Discovery in a Logical Probabilistic Model.
\newblock GitHub repository,
\newblock \url{https://github.com/gregorycoppola/discover-structure}.

\end{thebibliography}

\end{document}