Thursday, September 1, 2016

How to Succinctly Visualize the BehaviorSpace from NetLogo using Python

When I first programmed a heuristic rendition of Sugarcape, I was impressed by the results. Time and again, they were consistent across the behavior-space. But how does one convey results that take many hours to observe into simple visualizations? Is it possible to condense the data in such a manner?

First, I needed to generate data that could be read easily by Python. NetLogo's standard data generator does not do this well. Here is the code that I used instead:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
to setup
. . . 
  prep-csv-name
  reset-ticks
end

. . . 

to go
. . . 
  write-csv csv-name final-output
  tick
end

. . . 

to prepare-behavior-space-output
  set final-output (list
    total-sugar
    total-water
    mean-price-current-tick
    price-variance
    population
    
    basic-only
    basic-herder
    basic-arbitrageur
    basic-herder-arbitrageur
    switcher-only
    switcher-herder
    switcher-arbitrageur
    switcher-herder-arbitrageur

    current-percent-basic
    current-percent-arbitrageur
    current-percent-herder
    current-percent-switcher
    wealth-basic-only
    wealth-basic-herder
    wealth-basic-arbitrageur
    wealth-basic-herder-arbitrageur
    wealth-switcher-only
    wealth-switcher-herder
    wealth-switcher-arbitrageur
    wealth-switcher-herder-arbitrageur
    sugar-consumed-this-tick
    water-consumed-this-tick
    distance-from-equilibrium-price
    mean-price-50-tick-average
    mean [endogenous-rate-of-price-change] of turtles
    )
end

to write-csv [ #filename #items ]
  ;; #items is a list of the data (or headers!) to write.
  if is-list? #items and not empty? #items
  [ file-open #filename
    ;; quote non-numeric items
    set #items map quote #items
    ;; print the items
    ;; if only one item, print it.
    ifelse length #items = 1 [ file-print first #items ]
    [file-print reduce [ (word ?1 "," ?2) ] #items]
    ;; close-up
    file-close
  ]
end

to prep-csv-name
  set csv-name "ssugarscapeLocalTradeAllClassesFinal.csv"
  set csv-name replace-item 0 csv-name (word behaviorspace-run-number)
end

to-report quote [ #thing ]
  ifelse is-number? #thing
  [ report #thing ]
  [ report (word "\"" #thing "\"") ]
end

This creates csvs for each run that print out the value of objects described in and in the order presented in the list. This order is important as the same ordering is used in Python when I collect data from the csvs.

The setup looked like this:


[Late Add] I used 8 different settings for sugar and water metabolism. this is reflected in the python code under "num_x" and "num_y". 64 settings were used. Each setting was run 10 times to produce 640 runs, each of whose length were 15000 periods.

Below is the Python code that I used to generate visualizations that look like this (which were used in this article):

Each column and row show a pair of values representing the rate of consumption for each by the agents. In this case, I was capturing the change in wealth per capita, defined as units of time the agent can survive given his current stock of goods. "initialize_image(num_x, num_y):" and "color_points(title, filename, category, mean_values,pp, tick, min_val, max_val):" are the methods that generalize the output.

Please note that this code will fail if the size of your data set is too large to be held by your RAM. In that case, you will need to use the dask library to instantiate the dataframes held by the elements in the "runs" and/or "mean_values" dataframes.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

files = 640
ticks = 15000
run_period = []
#np.empty((ticks))
num_x = 8
num_y = 8
runs_per_setting = 10

for j in range(0, ticks):
 run_period.insert(j, j + 1)
run_period = pd.Series(run_period)
def prepare_lists(files):

#    # number of elements recorded in csv by netlogo
    num_categories = 30

#    # number of elements created using data from netlogo
    additional_categories = 16
    total_categories = num_categories + additional_categories
    runs = pd.Series([np.arange(files)])    
    names = pd.Series(['sugar', 
                   'water',
                   'mean_price',
                   'price_variance',
                   'population', 
                   'basic_only',
                   'basic_herder',
                   'basic_arbitrageur', 
                   'basic_herder_arbitrageur', 
                   'switcher_only', 
                   'switcher_herder', 
                   'switcher_arbitrageur', 
                   'switcher_herder_arbitrageur', 
                   'percent_basic',
                   'percent_arbitrageur', 
                   'percent_herder', 
                   'percent_switcher', 
                   'basic_only_wealth',
                   'basic_herder_wealth',
                   'basic_arbitrageur_wealth',
                   'basic_herder_arbitrageur_wealth',
                   'switcher_only_wealth',
                   'switcher_herder_wealth',
                   'switcher_arbitrageur_wealth',
                   'switcher_herder_arbitrageur_wealth', 
                   'sugar_flow', 
                   'water_flow',
                   'distance_from_equilibrium_price',
                   'fifty_period_RAP', 
                   'mean_rate_of_price_change'])

    names_dict = {'sugar': [np.arange(ticks)], 
             'water': [np.arange(ticks)],
             'mean_price': [np.arange(ticks)], 
             'price_variance': [np.arange(ticks)],
             'population': [np.arange(ticks)],
             'basic_only': [np.arange(ticks)], 
             'basic_herder': [np.arange(ticks)],
             'basic_arbitrageur': [np.arange(ticks)],
             'basic_herder_arbitrageur': [np.arange(ticks)],
             'switcher_only': [np.arange(ticks)],
             'switcher_herder': [np.arange(ticks)],
             'switcher_arbitrageur': [np.arange(ticks)],
             'switcher_herder_arbitrageur': [np.arange(ticks)],
             'percent_basic': [np.arange(ticks)],
             'percent_arbitrageur': [np.arange(ticks)],
             'percent_herder': [np.arange(ticks)], 
             'percent_switcher': [np.arange(ticks)],
             'basic_only_wealth': [np.arange(ticks)],
             'basic_herder_wealth': [np.arange(ticks)],
             'basic_arbitrageur_wealth': [np.arange(ticks)],
             'basic_herder_arbitrageur_wealth': [np.arange(ticks)],
             'switcher_only_wealth': [np.arange(ticks)],
             'switcher_herder_wealth': [np.arange(ticks)],
             'switcher_arbitrageur_wealth': [np.arange(ticks)],
             'switcher_herder_arbitrageur_wealth': [np.arange(ticks)], 
             'sugar_flow': [np.arange(ticks)],
             'water_flow': [np.arange(ticks)],
             'distance_from_equilibrium_price': [np.arange(ticks)],
             'fifty_period_RAP': [np.arange(ticks)], 


            # Categories to be generated within python             
             'basic_only_wealth_per_capita': [np.arange(ticks)],
             'basic_herder_wealth_per_capita': [np.arange(ticks)],
             'basic_arbitrageur_wealth_per_capita': [np.arange(ticks)],
             'basic_herder_arbitrageur_wealth_per_capita': [np.arange(ticks)],
             'switcher_only_wealth_per_capita': [np.arange(ticks)],
             'switcher_herder_wealth_per_capita': [np.arange(ticks)],
             'switcher_arbitrageur_wealth_per_capita': [np.arange(ticks)],
             'switcher_herder_arbitrageur_wealth_per_capita': [np.arange(ticks)],             
             'percent_basic_only': [np.arange(ticks)],
             'percent_basic_herder': [np.arange(ticks)],
             'percent_basic_arbitraguer': [np.arange(ticks)],
             'percent_basic_herder_arbitraguer': [np.arange(ticks)],
             'percent_switcher_only': [np.arange(ticks)],
             'percent_switcher_herder': [np.arange(ticks)],
             'percent_switcher_arbitraguer': [np.arange(ticks)],
             'percent_switcher_herder_arbitraguer': [np.arange(ticks)],


            }
    for i  in range(1, files + 1):
        filename = str(i) + 'sugarscapeLocalTradeBasics.csv'
        runs[i] = pd.read_csv(filename, names = names)
#        , sep = None,engine='python')

        # Add categories generated from data manipulation
        runs[i]['basic_only_wealth_per_capita'] = runs[i]['basic_only_wealth'] / runs[i]['basic_only']
        runs[i]['basic_herder_wealth_per_capita'] = runs[i]['basic_herder_wealth'] / runs[i]['basic_herder']        
        runs[i]['basic_arbitrageur_wealth_per_capita'] = runs[i]['basic_arbitrageur_wealth'] / runs[i]['basic_arbitrageur']
        runs[i]['basic_herder_arbitrageur_wealth_per_capita'] = runs[i]['basic_herder_arbitrageur_wealth'] / runs[i]['basic_herder_arbitrageur']
        runs[i]['switcher_only_wealth_per_capita'] = runs[i]['switcher_only_wealth'] / runs[i]['switcher_only']
        runs[i]['switcher_herder_wealth_per_capita'] = runs[i]['switcher_herder_wealth'] / runs[i]['switcher_herder']        
        runs[i]['switcher_arbitrageur_wealth_per_capita'] = runs[i]['switcher_arbitrageur_wealth'] / runs[i]['switcher_arbitrageur']
        runs[i]['switcher_herder_arbitrageur_wealth_per_capita'] = runs[i]['switcher_herder_arbitrageur_wealth'] / runs[i]['switcher_herder_arbitrageur']
        runs[i]['percent_basic_only'] =  runs[i]['basic_only'] / runs[i]['population']
        runs[i]['percent_basic_herder'] =  runs[i]['basic_herder'] / runs[i]['population']
        runs[i]['percent_basic_arbitrageur'] = runs[i]['basic_arbitrageur'] / runs[i]['population']
        runs[i]['percent_basic_herder_arbitrageur'] = runs[i]['basic_herder_arbitrageur'] / runs[i]['population']
        runs[i]['percent_switcher_only'] = runs[i]['switcher_only'] / runs[i]['population']
        runs[i]['percent_switcher_herder'] =  runs[i]['switcher_herder']  / runs[i]['population'] 
        runs[i]['percent_switcher_arbitrageur'] = runs[i]['switcher_arbitrageur']  / runs[i]['population']
        runs[i]['percent_switcher_herder_arbitrageur'] =  runs[i]['switcher_herder_arbitrageur']  / runs[i]['population']
        
        # Keep track of run # processesed
        print(i)

    # will be used to record mean parameter values for particular settings 
    # across the behavior space
    mean_values = pd.DataFrame(columns = [np.arange(num_x)], index = [np.arange(num_y)])
    
#    run_name = np.empty((num_x, num_y, runs_per_setting), dtype = np.dtype((str,32))) 
    for x in range(0,num_x):
        for y in range(0,num_y):
            
            mean_values[x][y] = pd.DataFrame(names_dict)
    
    for x in range(0,num_x):
        for y in range(0,num_y ):
            for z in range(0,runs_per_setting):
                # Add 1 because file index starts at 1
            
#                run_name[x][y][z] = 'Cs = ' + str(.5 + x * .025) + ' Cw = ' + str(.5 + y * .025) + ' run ' + str(z + 1) + ' of ' + str(runs_per_setting)
                run_number = x * num_y * runs_per_setting + y * runs_per_setting + z + 1
                
                mean_values[x][y] = mean_values[x][y].add(runs[run_number], fill_value=0)
                
            print("processing: " + str(run_number))
    mean_values = mean_values / runs_per_setting
    print_behavior_space_representation(mean_values)
            
##############################################################################################                        
##############################################################################################



def initialize_image(num_x, num_y):
 image = []
 for i in range(num_y):
  x_colors = []
  for j in range(num_x):
   x_colors.append(0)
  image.append(x_colors)
 return image


def color_points(title, filename, category, mean_values,pp, tick, min_val, max_val):
    image = initialize_image(num_x, num_y)
    
    for i in range(0, num_x):
        for j in range(0, num_y):
            image[i][j] = mean_values[i][j].iloc[tick][category]
    print(image)        
    plt.imshow(image, origin='lower', extent=(.45, .8,  .45, .8),
               cmap=cm.Greys_r, interpolation = 'nearest')
    plt.colorbar()
    plt.clim(min_val, max_val)
    plt.xlabel('Water Consumption Rate')
    plt.ylabel('Sugar Consumption Rate')
    plt.title(title + " Period " +  str(tick + 1))
    fig = plt.gcf()
    plt.show()
    plt.draw()
    pp.savefig(fig)


def print_behavior_space_representation(data):  
    interval = 500
    pp = PdfPages('population_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Population\nLocal Trade Basics","Population Local Trade Basics", "population", data, pp, q,250,1800)
    plt.close('all')
    pp.close()

    pp = PdfPages('PV_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Price Variance\nLocal Trade Basics", "Price Variance Local Trade Basics", "price_variance", data, pp, q,0 , 8)
    plt.close('all')
    pp.close()

    pp = PdfPages('MP_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Mean Price (Logged)\nLocal Trade Basics","Mean Price Local Trade Basics", "mean_price", data, pp, q, -1, 8)
    plt.close('all')
    pp.close()

    pp = PdfPages('DfEP_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Distance From Predicted Equilibrium Price\nLocal Trade Basics", "Distance From Predicted Equilibrium Price Local Trade Basics", "distance_from_equilibrium_price", data, pp, q, -1, 8)
    plt.close('all')
    pp.close()

    pp = PdfPages('FPAMP_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Fifty Period Average Moving Price (Logged) \nLocal Trade Basics", "Fifty Period Average Moving Price (Logged) Local Trade Basics", "fifty_period_RAP", data, pp, q, -1, 8)
    plt.close('all')
    pp.close()

    pp = PdfPages('MRoPC_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Mean Rate of Price Change\nLocal Trade Basics", "Mean Rate of Price Change Local Trade Basics", "mean_rate_of_price_change", data, pp, q, 0, 1)
    plt.close('all')
    pp.close()

    pp = PdfPages('PB_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):    
        color_points("Percent Basic \nLocal Trade Basics", "Percent Basic Local Trade Basics", "percent_basic", data, pp, q, 0, 1)
    plt.close('all')
    pp.close()

    pp = PdfPages('WPC_local_basic_tick_15000.pdf')
    for q in range(999, ticks, interval):        
        color_points("Wealth Per Capita \nLocal Trade Basics", "Wealth Per Capita Local Trade Basics", "basic_only_wealth_per_capita", data, pp, q, 50, 400)                
    plt.close('all')
    pp.close()
                
if __name__ == '__main__':     
    np.save('data',    prepare_lists(files)) 
    np.load('data.npy')

No comments:

Post a Comment