Graph Network for Graph Generation



I am interested in graph generation through machine learning or some other method.

Hardware consists of wires and transistors, and these compose a graph as a logic circuit. I want to try to generate neural network hardware described in a hardware description language (HDL) from a network model description such as PyTorch, TensorFlow, etc.

There is infrastructure such as NNVM and TVM, for example, to generate programs for specific hardware. Recently there has been an effort to develop Verilog-HDL generation.

A silicon chip cannot hold an entire neural network graph consisting of billions of nodes and edges, so partitioning and scheduling of such fragments are necessary.

I think a graph neural network can work for this; I expect it to detect similar subgraph patterns (fragments).
Alternatively, neural architecture search is a hot topic, so its techniques might be applied to my hardware architecture search domain.
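As a minimal sketch of the subgraph-similarity idea (plain Python, all names hypothetical): fragments of a dataflow graph can be fingerprinted by a cheap structural signature such as the sorted degree sequence, and fragments that share a fingerprint grouped together. A graph neural network could learn a much finer signature, but the grouping mechanics look the same:

```python
from collections import defaultdict

def fingerprint(nodes, edges):
    """Structural signature of a fragment: sorted (in-degree, out-degree) pairs.
    Isomorphic fragments always share a fingerprint (the converse need not hold)."""
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return tuple(sorted((indeg[n], outdeg[n]) for n in nodes))

def group_fragments(fragments):
    """Group fragments (each a (nodes, edges) pair) by structural fingerprint."""
    groups = defaultdict(list)
    for frag in fragments:
        groups[fingerprint(*frag)].append(frag)
    return groups

# Two chain-shaped fragments and one fan-out fragment.
dot1 = (["a", "b", "c"], [("a", "b"), ("b", "c")])
dot2 = (["x", "y", "z"], [("x", "y"), ("y", "z")])
fan  = (["p", "q", "r"], [("p", "q"), ("p", "r")])

groups = group_fragments([dot1, dot2, fan])
print(len(groups))  # 2: the two chains collapse into one group
```

A real pipeline would verify candidates in each group with an exact isomorphism check, since degree sequences alone can collide.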

Any suggestions are welcome.

Thanks and Regards,


Hi Takano san,

I’m not an expert on graph generation, but I’m aware that DeepMind published a paper dealing with graph networks. You can read a short summary on their official website, and find the official implementation in their repo. I hope you find this useful!


Hi Jabalazs-san,

Happy New Year!

Thank you for your quick reply.
I will check the site after posting this reply.



I quickly read the paper linked from there; it seems to be about relational reasoning based on a graph network definition. They define a graph network and its update phases.
However, it does not mention operational details, so I will need to read the source code, but I think this is not related to my ideas.


If possible, I will push my code to GitHub somewhere.



I’m not aware of any work generating neural networks as graphs as you described. What is the size of the graphs you’d like to generate (number of nodes and edges)?

I’m also working on graph generation, as a way to generate novel molecules. In this case a chemical molecule is represented as a graph, but those graphs are comparatively small.

Here are some recent publications on this problem:

  1. CGVAE by Microsoft Research (GitHub repo)
  2. MolGAN by De Cao and Kipf (GitHub repo)
  3. GraphRNN from the Leskovec Lab (arXiv)

CGVAE and GraphRNN generate graphs sequentially (i.e., adding nodes and edges step by step). In contrast, MolGAN generates a graph of a given size in a single step.
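As a toy illustration of the sequential scheme (not the actual model from any of the papers), the generator below grows a graph one step at a time; a real method would replace the random policy with a learned one:

```python
import random

def generate_graph(max_nodes, seed=0):
    """Grow a graph step by step: each step either adds a node wired to a
    randomly chosen existing node, or stops. A learned policy would replace
    the `rng` decisions."""
    rng = random.Random(seed)
    nodes = [0]
    edges = []
    while len(nodes) < max_nodes:
        if rng.random() < 0.1:   # stand-in for a learned stop probability
            break
        new = len(nodes)
        edges.append((rng.choice(nodes), new))
        nodes.append(new)
    return nodes, edges

nodes, edges = generate_graph(max_nodes=8)
# A tree by construction: always one fewer edge than nodes.
print(len(edges) == len(nodes) - 1)  # True
```

Sequential generation makes validity constraints easy to enforce at each step, which is exactly why CGVAE and GraphRNN take this route; one-shot generation trades that for parallelism.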

In my understanding, none of these methods are suitable for modeling huge, structured networks.

If you can point me to an example graph, I can think of possible ways to generate it.




Thank you for your reply.

I would like to generate a neural network as hardware; a small example is AlexNet.
However, it consists of more than a million weights and edges, so partitioning and scheduling are required.


  1. Model size is variable.
  2. Connection patterns should be detected.
  3. The number of detected patterns is variable.
  4. The detected patterns (fragments) should be scheduled for loading onto the chip and storing to external memory, i.e., swapping in only the necessary fragment(s).

There are several phases:

  1. Detection of patterns in the graph based on a critical-path hyperparameter
  2. Cost evaluation to schedule the fragments and decide the order in which to bind them onto the chip

For example, a fully connected layer is a matrix-vector multiplication. Therefore it can be composed of multiple dot-product logic circuits, and just replicating the logic circuit on the chip is sufficient. In this case, detecting the dot-product pattern is needed, and replication as scheduling is necessary.
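The replication idea can be sketched as follows (plain Python, names hypothetical): r "dot-product units" each take an interleaved slice of the weight rows and share the input vector x, which is exactly the structure the replicated hardware would have:

```python
def dot(row, x):
    """One dot-product unit: what a single replicated circuit computes."""
    return sum(w * xj for w, xj in zip(row, x))

def fc_layer(W, b, x, r=2):
    """Fully connected layer as r replicated dot-product units sharing x.
    Unit k handles rows k, k+r, k+2r, ... (round-robin row assignment)."""
    I = len(W)
    c = [0] * I
    for k in range(r):                 # each k = one hardware unit
        for i in range(k, I, r):
            c[i] = b[i] + dot(W[i], x)
    return c

W = [[1, 0], [0, 1], [1, 1]]
b = [0, 0, 1]
x = [2, 3]
print(fc_layer(W, b, x, r=2))  # [2, 3, 6]
```

The result is independent of r, which is the point: r is purely a resource/scheduling parameter, not part of the computation's meaning.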

I hope these give you some hints.



There are several challenges in this domain:

  1. RTL (Register Transfer Level) description generation
    Directly generate neural network hardware
  • Recently TVM (which can import ONNX models) supports Verilog-HDL (Hardware Description Language) generation.
  • LeFlow + LegUp can generate RTL; I tried it, but LeFlow had problems generating.
  • These do not support partitioning and scheduling at the moment, but these functions are definitely necessary.
  2. Program generation for specific processors
  • NNVM and/or TVM work as the compilation front-end, after which your own back-end generates programs for your own processor.

Approach 1 is studied by universities, but the result is not programmable hardware, meaning that any update needs a full design process.
Approach 2 is popular among startups; it needs a transformation of the graph to fit the platform topology.



Related works are:


Sorry, I’m not familiar with HDL generation. From what I understood, I think looking into Neural Architecture Search would be more useful. Google’s NASNet learns to generate network architectures that maximize accuracy using reinforcement learning. At each time step it selects an action (e.g., adding a 3x3 convolution) and observes a reward (e.g., validation accuracy). Later, Google introduced MnasNet to incorporate latency into the main objective.

Hope this is helpful.

  1. NASNet
  2. MnasNet
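The select-action/observe-reward loop can be caricatured as below (a toy greedy search, not NASNet's actual policy-gradient controller, and the "accuracy" function is invented for illustration):

```python
def toy_accuracy(arch):
    """Invented stand-in for validation accuracy: rewards depth up to
    4 layers and penalizes anything deeper."""
    return len(arch) / 4 if len(arch) <= 4 else 2 - len(arch) / 4

def search(actions, steps=8):
    """Greedy architecture search: at each step, try appending every
    action and keep the change only if the reward improves."""
    arch, best = [], toy_accuracy([])
    for _ in range(steps):
        candidates = [arch + [a] for a in actions]
        scored = max(candidates, key=toy_accuracy)
        if toy_accuracy(scored) <= best:
            break                      # no action improves the reward
        arch, best = scored, toy_accuracy(scored)
    return arch, best

arch, acc = search(["conv3x3", "conv5x5", "maxpool"])
print(len(arch), acc)  # 4 1.0
```

NASNet replaces the greedy step with an RNN controller trained by REINFORCE, and MnasNet replaces `toy_accuracy` with a reward combining accuracy and measured latency; the outer loop shape is the same.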



Thank you for your advice.
I will read the papers tomorrow.
HDL generation itself is not important, but data flow graph generation is.

There are two kinds of approaches:

  1. Target network model from a source
    Input a source network model and output a hardware model (reference method)

  2. Target network model from scratch
    No input source; generate by trial and error and output a hardware model (scratch method)



I checked NASNet.

It uses reinforcement learning to decide how to update the network model, so it falls under approach 2, the scratch method.
Presumably there is a cycle: a neural network updates the design, the designed network is trained and evaluated through its accuracy, and the result is fed back to the design.

Please remember that I am trying to develop neural network hardware based on approach 1 and/or 2. This means there is at least a reference neural network, such as AlexNet, ResNet, VGG, etc. These networks should first be translated to a data flow graph, and finally translated to HDL or a program.

This implies:

  1. The source network model should be translated to a data flow graph
  2. The data flow graph is transformed to fit a “virtual” platform hardware
  3. The transformed data flow graph must be partitioned to fit the “physical” platform hardware
  4. The partitioned fragments are scheduled to minimize execution time

The partitioning must detect similar topologies in the graph so that a topology can be reused several times; at this point, deep learning can be applied. Scheduling is also based on a cost function, so some deep learning model can be applied there too.
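Steps 3 and 4 of that pipeline can be sketched together (plain Python, all node names and sizes hypothetical): a topological order of the data flow graph is cut into capacity-bounded fragments, and executing the fragments in that order automatically respects every dependency:

```python
def topo_order(deps):
    """Topological sort of a DAG given as {node: [predecessors]}
    (Kahn's algorithm)."""
    remaining = {n: set(p) for n, p in deps.items()}
    order = []
    while remaining:
        ready = sorted(n for n, p in remaining.items() if not p)
        for n in ready:
            order.append(n)
            del remaining[n]
        for p in remaining.values():
            p.difference_update(ready)
    return order

def partition(deps, sizes, capacity):
    """Cut a topological order into fragments whose total size fits on
    chip; fragments then execute (and are swapped in) in list order."""
    fragments, current, used = [], [], 0
    for n in topo_order(deps):
        if used + sizes[n] > capacity:
            fragments.append(current)
            current, used = [], 0
        current.append(n)
        used += sizes[n]
    if current:
        fragments.append(current)
    return fragments

# Tiny DFG: two dot-products feeding an add, then an activation.
deps = {"dot1": [], "dot2": [], "add": ["dot1", "dot2"], "relu": ["add"]}
sizes = {"dot1": 2, "dot2": 2, "add": 1, "relu": 1}
print(partition(deps, sizes, capacity=3))  # [['dot1'], ['dot2', 'add'], ['relu']]
```

A smarter partitioner would also cut at pattern boundaries found in step 1, so that identical fragments map onto the same replicated circuit.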



Directly synthesizing hardware means making a data flow graph.
For a fully connected layer:

for (i = 0; i < I; i++) {
    c[i] = b[i];
    for (j = 0; j < J; j++)
        c[i] += w[i][j] * x[j];
}

This code fragment can be loop-unrolled as follows:

for (i = 0; i < I; i += r) {
    for (k = 0; k < r; k++)
        c[i+k] = b[i+k];
    for (j = 0; j < J; j++) {
        c[i+0]   += w[i+0][j]   * x[j];
        c[i+1]   += w[i+1][j]   * x[j];
        ...
        c[i+r-1] += w[i+r-1][j] * x[j];
    }
}

where r is the number of computation resources.
This gives r dot-products sharing the elements of x.
In addition, loop unrolling over index j is also possible, which yields a more detailed data flow graph.
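Fully unrolling over j turns one dot product into a concrete data flow graph. A sketch of building it (plain Python, node names invented): J multiply nodes feed a chain of J accumulate-add nodes:

```python
def dot_product_dfg(J):
    """Data flow graph of one fully unrolled dot product c += w[j]*x[j]:
    J multiply nodes feeding a chain of J accumulate-add nodes."""
    nodes, edges = [], []
    prev = "c_in"                       # running partial sum
    for j in range(J):
        mul, add = f"mul{j}", f"add{j}"
        nodes += [mul, add]
        edges += [(f"w{j}", mul), (f"x{j}", mul),   # operand inputs
                  (mul, add), (prev, add)]          # accumulate step
        prev = add
    return nodes, edges

nodes, edges = dot_product_dfg(J=4)
print(len(nodes), len(edges))  # 8 16
```

Replacing the linear accumulate chain with a balanced adder tree would shorten the critical path from J adds to log2(J) adds, which ties back to the critical-path hyperparameter mentioned earlier in the thread.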