Extracting Rules from Neural Networks with Partial Interpretations
Journal article, Peer reviewed
Published version

Date
2022
Collections
- Department of Informatics [882]
- Registrations from Cristin [8669]
Original version
Proceedings of the Northern Lights Deep Learning Workshop. 2022, 3. 10.7557/18.6301
Abstract
We investigate the problem of extracting rules, expressed in Horn logic, from neural network models. Our work is based on the exact learning model, in which a learner interacts with a teacher (here, the neural network model) via queries in order to learn an abstract target concept, which in our case is a set of Horn rules. We formulate the queries using partial interpretations, which can be understood as representations of the world in which the truth values of some propositions are unknown. We employ Angluin’s algorithm for learning Horn rules via queries and evaluate our strategy empirically.
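To make the notion of a query over a partial interpretation concrete, the following minimal Python sketch may help. It is an illustration only, not the method of the paper: the names `HornRule`, `eval_rule`, and `membership_query` are hypothetical, the teacher is a stand-in that holds an explicit set of Horn rules rather than a neural network, and the use of Kleene's strong three-valued semantics for unknown propositions is an assumption, since the abstract does not specify how unknown values are handled.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional

# A partial interpretation maps each proposition to True, False, or
# None (unknown). Propositions missing from the dict are also unknown.
PartialInterpretation = Dict[str, Optional[bool]]


class HornRule(NamedTuple):
    """Hypothetical representation of a Horn rule: body -> head."""
    body: FrozenSet[str]
    head: Optional[str]  # None stands for falsum (a purely negative rule)


def eval_rule(rule: HornRule, interp: PartialInterpretation) -> Optional[bool]:
    """Evaluate body -> head under Kleene's strong three-valued logic
    (an assumption; the source does not fix the semantics)."""
    body_vals = [interp.get(p) for p in rule.body]
    if any(v is False for v in body_vals):
        return True  # a false body makes the implication hold
    head_val = False if rule.head is None else interp.get(rule.head)
    if head_val is True:
        return True  # a true head makes the implication hold
    if all(v is True for v in body_vals) and head_val is False:
        return False  # body fully true, head false: rule violated
    return None  # not enough information to decide


def membership_query(target: List[HornRule],
                     interp: PartialInterpretation) -> Optional[bool]:
    """Stand-in teacher: does interp satisfy every rule of the target?"""
    results = [eval_rule(r, interp) for r in target]
    if any(r is False for r in results):
        return False
    if all(r is True for r in results):
        return True
    return None


# Hypothetical target concept: a single rule (a and b) -> c.
target = [HornRule(frozenset({"a", "b"}), "c")]
violated = membership_query(target, {"a": True, "b": True, "c": False})
undecided = membership_query(target, {"a": True, "b": None, "c": False})
satisfied = membership_query(target, {"a": False})
```

In this sketch a query can come back undecided (`None`) when the unknown propositions matter to the outcome, which is one way to read "part of the knowledge is unknown"; other treatments, such as asking whether some completion of the partial interpretation satisfies the rules, are equally plausible.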