[1911.02425v1] weg2vec: Event embedding for temporal networks
While, for demonstration, most of the results in the paper are shown only for the conference and primary school settings, a detailed data description together with the analysis of the other networks is presented in the Supplementary Information (SI).

Network embedding techniques are powerful tools for capturing structural regularities in networks and for identifying similarities between their local fabrics. However, conventional network embedding models are developed for static structures, commonly consider nodes only, and are seriously challenged when the network varies in time. Temporal networks may provide an advantage in the description of real systems, but they encode more complex information, which only a handful of methods have been able to represent effectively so far. Here, we propose a new method for the event embedding of temporal networks, called weg2vec, which builds on temporal and structural similarities of events to learn a low-dimensional representation of a temporal network. This projection successfully captures latent structures and similarities between events involving different nodes at different times, and provides ways to predict the final outcome of spreading processes unfolding on the temporal structure.
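The first two steps of such a pipeline (projecting a temporal network into a weighted event graph and sampling environments for each event) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as defined in the paper: the linking rule (two events share a node and are at most `dt_max` apart), the weight `1 / (1 + dt)`, and the function names `build_event_graph` and `sample_environments` are all hypothetical choices made here. Events are assumed to be distinct `(u, v, t)` tuples.

```python
import random
from collections import defaultdict


def build_event_graph(events, dt_max):
    """Link events that share a node and are at most dt_max apart in time.

    The weight 1 / (1 + dt) is an illustrative assumption, not the paper's
    definition of event-graph weights.
    """
    graph = defaultdict(dict)
    ordered = sorted(events, key=lambda e: e[2])
    for i, (u1, v1, t1) in enumerate(ordered):
        for u2, v2, t2 in ordered[i + 1:]:
            dt = t2 - t1
            if dt > dt_max:
                break  # events are time-sorted, so all later ones are too far
            if {u1, v1} & {u2, v2}:
                w = 1.0 / (1.0 + dt)
                graph[(u1, v1, t1)][(u2, v2, t2)] = w
                graph[(u2, v2, t2)][(u1, v1, t1)] = w
    return graph


def sample_environments(graph, event, nb, s, rng):
    """Draw nb environments, each of s neighbouring events (with replacement),
    sampled with probability proportional to the link weight."""
    neighbours = list(graph.get(event, {}))
    if not neighbours:
        return []
    weights = [graph[event][n] for n in neighbours]
    return [rng.choices(neighbours, weights=weights, k=s) for _ in range(nb)]


# Toy temporal network: (node, node, time) triples invented for illustration.
events = [("a", "b", 0), ("b", "c", 1), ("c", "d", 2), ("a", "d", 10)]
graph = build_event_graph(events, dt_max=3)
envs = sample_environments(graph, ("b", "c", 1), nb=10, s=10, rng=random.Random(0))
```

Each sampled environment plays the role of a "sentence" whose "words" are events, so the collection could then be passed to any Skip-Gram implementation to obtain event vectors.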
‹Figure 2: Schematic presentation of the methodological pipeline of the presented temporal network embedding method, which takes a temporal network to (a) project it into a weighted event graph; (b) sample a set of environments for each event; (c) use these as input for a Skip-Gram model; and (d) obtain an event embedding of the original network. (Results)
Figure 3: 3-dimensional embeddings of the conference and primary school networks. The x, y and z axes indicate event coordinates, while colour shows, in panels (a-b), the time at which the event occurs and, in panels (c-d), mesoscale structure membership found using a tensor factorisation method (see Section ??) set to find 5 and 10 of these structures, respectively. Hyperparameters were set to α = 0.5, nb = 10 and s = 10. (Neighbourhood sampling strategy)
Figure 4: Entropy values with respect to the number d of embedding dimensions for the conference (a) and the primary school (b) networks at α = 0.5. The dashed line marks the optimal embedding dimension at which stability is reached (d = 20 for (a) and d = 24 for (b)). The blue line and the shaded area represent, respectively, the average and the variance among the samples we used for the analysis. (Effects of the dimension and of the neighbourhood sampling)
Figure 5: Dependence of the R-squared values, r², on the number nb (x-axis) and size s (y-axis) of sampled environments. Panel (a) shows results for the conference network and panel (b) for the primary school network. Colours and z-axis code the average r² score values obtained for given nb and s parameter pairs, computed over 10 realisations. α was fixed to 0.5; we set d = 20 for panel (a) and d = 24 for panel (b), see Figure ??. (Parameter dependencies)
Figure 6: R-squared values, r², as a function of the number d of embedding dimensions (x-axis) and the sampling balance parameter α (y-axis). Panel (a) shows results for the conference network and panel (b) for the primary school network. Colours and z-axis code the average r² score values obtained for given d and α parameter pairs, computed over 10 realisations. Other hyperparameters were fixed to nb = 10 and s = 10. (Parameter dependencies)
Figure 7: Comparison of STWalk, Online-Node2vec and our embedding method in predicting spreading outcomes on empirical networks in different settings: (a) conference, (b) hospital, (c) high school, and (d) primary school. Results shown are r² scores obtained from linear regression on coordinates in embedding spaces of various dimensions, computed for each method and empirical temporal network. (Comparison with other methods)
Figure 8: 3-dimensional embeddings of the hospital and high school networks. Colours show, in panels (a-b), the time at which the event occurs and, in panels (c-d), mesoscale structure membership detected by a tensor factorisation method with 3 and 6 groups, respectively. Hyperparameters were set to α = 0.5, nb = 10 and s = 10. (Embedding of temporal networks)
Figure 9: Dependence of the r² correlation score on the number nb and size s of sampled environments. Results are shown for the (a) hospital and (b) high school networks. Colours and z-axis code the average r² score values for given nb and s parameter pairs, computed over 10 realisations. Other parameters were fixed to α = 0.5 and (a) d = 14 and (b) d = 26. (Parameter dependence)
Figure 10: Dependence of the r² correlation score on the number d of dimensions and the sampling balance parameter α. Results are shown for the (a) hospital and (b) high school networks. Colours and z-axis code the average r² score values obtained for given d and α parameter pairs, computed over 10 realisations. Other parameters were fixed to nb = 10 and s = 10. (Parameter dependence)
Figure 11: Entropy values with respect to the number d of embedding dimensions for the (a) hospital and (b) high school networks, with α = 0.5 and s = nb = 10. The dashed line indicates the optimal embedding dimension at which stability is reached. The blue line and the shaded area represent, respectively, the average and the variance among the samples we used for the analysis. (Entropy measures)
Figure 12: Epidemic size distribution of the (a) conference, (b) hospital, (c) high school, and (d) primary school original temporal networks. (Epidemic size distribution of original networks and randomized models)
Figure 13: Epidemic size distributions on snapshot shuffled RRM networks for the (a) conference, (b) hospital, (c) high school, and (d) primary school networks. Colours distinguish different random realisations of the actual network model. (Epidemic size distribution of original networks and randomized models)
Figure 14: Epidemic size distributions on timeline shuffled RRM networks for the (a) conference, (b) hospital, (c) high school, and (d) primary school networks. Colours distinguish different random realisations of the actual network model. (Epidemic size distribution of original networks and randomized models)
Figure 15: Epidemic size distributions on link shuffled RRM networks for the (a) conference, (b) hospital, (c) high school, and (d) primary school networks. Colours distinguish different random realisations of the actual network model. (Epidemic size distribution of original networks and randomized models)›
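Several of the captions above report r² scores obtained from a linear regression on embedding coordinates. A minimal sketch of that kind of evaluation is given below, assuming ordinary least squares with an intercept and the usual coefficient of determination; the paper's exact regression setup is not stated here, and the function names and toy data are hypothetical.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented system (A^T A | A^T y).
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for k in range(p):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]


def r_squared(X, y, coef):
    """Coefficient of determination of the fitted linear model on (X, y)."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - q) ** 2 for t, q in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Toy data invented for illustration: 2-dimensional "embedding coordinates"
# and a target that is exactly linear in them.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in X]
coef = fit_linear(X, y)
score = r_squared(X, y, coef)
```

In an actual evaluation, `X` would hold the embedding coordinates of events and `y` the spreading outcome to be predicted; a score near 1 would indicate that the outcome is well explained by a linear function of the coordinates.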