I need to use Gremlin in a Jupyter (IPython) notebook to query a graph database hosted on AWS Neptune. I'm using Neptune ML's graph neural network functionality to perform link prediction. Specifically, I want to know which nodes of type "TYPE_X" are connected to the nodes whose IDs are stored in my variable "id_variable".
My query looks like this:
%%gremlin
g.with("Neptune#ml.endpoint","${endpoint}").
V(${id_variable}).
project('name', 'related to').
by('name').
by( out('RELATED_TO').with("Neptune#ml.prediction").
hasLabel('TYPE_X').values('name') ).
order(local).by(keys, desc)
which returns the following output:
{'name': 'AANAT', 'related to': 'WDR7'}
{'name': 'ACACA', 'related to': 'BTN1A1'}
{'name': 'ACTA1', 'related to': 'MDH'}
{'name': 'ALAS1', 'related to': 'WDR7'}
{'name': 'ALAS2', 'related to': 'TAC3'}
{'name': 'ALDH2', 'related to': 'SOCS2'}
{'name': 'ALDOA', 'related to': 'PRKAB2'}
{'name': 'AKR1B1', 'related to': 'ODF2L'}
{'name': 'ALOX15', 'related to': 'BMP15'}
My problem is that this output is only displayed inline in the notebook cell's output; instead, I would like either to assign it to a variable or to write it to a file, for instance as JSON. However, I cannot do variable assignment with the %%gremlin cell magic, and so far I have not found a way to write the output to a file.
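To make the goal concrete, here is a minimal sketch of what I would like to end up with. It assumes the rows printed above were somehow available as a Python list of dicts; the `results` variable, the `save_results` helper and the file name are all hypothetical, and getting `results` out of the %%gremlin cell is exactly the part I cannot do.

```python
import json

# Hypothetical: `results` stands in for the rows that the %%gremlin cell displays.
# Only the first two rows of the output are reproduced here.
results = [
    {"name": "AANAT", "related to": "WDR7"},
    {"name": "ACACA", "related to": "BTN1A1"},
]


def save_results(rows, path):
    """Write the list of result maps to `path` as a JSON array."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(rows, fh, indent=2)


# Hypothetical output file name.
save_results(results, "link_predictions.json")
```

Either half would already help: having `results` as a variable in the notebook, or getting the file written directly.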
Please note that I was not able to run this query in a plain .py script using the gremlin_python library, as it does not seem to support Neptune's ML functionality (specifically, it throws an error on the .with("Neptune#ml.endpoint","${endpoint}") syntax).
Any suggestion is more than welcome!
Thank you in advance