Black Friday and Cyber Monday have come and gone, and the team and I kicked ass... We kicked major ass. All that work over the past 11 months paid off over that single week of retail Hell, and our systems just plain kicked ass...
But now it is time for all the vacation that I didn't take during the year while prepping for the busiest few days of the year. I am taking off the entire month of December (and still rolling 40 hours of vacation into next year!).
What to do with all this time?
On Cyber Monday, O'Reilly had a 60% off sale on books and videos, so I decided to purchase a few videos to get up to speed on some technologies that I haven't fiddled around with yet.
During December I hope to work through the following videos:
An Introduction to MapReduce with Pete Warden
Erlang by Example with Cesarini and Thompson
Hilary Mason: An Introduction to Machine Learning with Web Data
I've had a very brief introduction to Machine Learning during a GDAT class with Dr. Gunther and found it very interesting; it seems like something that could be cool for automated monitoring of systems, to determine when something is going astray.
I also plan on covering some Haskell with the Channel9 lecture series on Haskell and functional programming. I've done a little bit of FP with R, and it is time to wrap my head around FP further with Haskell and Erlang.
The current plan on the Haskell front is to gather with fellow Geeks at the Addison, TX location of Thoughtworks for Geeknight to watch the Channel9 videos and learn the ways of FP first hand.
I'm sure that good times will ensue!
Wednesday, November 23, 2011
Wrapping perl around tqharvc for generating scatter plots
TeamQuest View is a handy utility for looking at metrics collected on servers by TeamQuest, but one thing that I haven't liked about TeamQuest View is creating graphs from collected metrics. Fortunately, installing TeamQuest View also installs a command line utility by the name of tqharvc. It is possible to execute tqharvc.exe from the command line to extract metrics and create plots from the output.
I got the idea of writing some perl code that slices and dices the .RPT files that TeamQuest View can generate, passes the resulting queries to tqharvc, and hands the output to a graphing routine to generate both plots and .CSV files for later processing if need be.
For graphing the data I use GD::Graph to generate a scatter plot.
For the example in this blog entry I have a very simple .RPT file by the name of physc.RPT; as the name implies, it simply reports physc usage from an LPAR:
[General]
Report Name = "physc"
View = Line
Format = Date+Time,
Parameter
[dParm1]
System = "[?]"
Category Group = "CPU"
Category = "by LPAR"
Subcategory = ""
Statistic = "physc"
Value Types = "Average"
Simple enough, isn't it?
My perl routine builds a query for tqharvc from each metric sliced out of the .RPT file, iterates through those metrics, collects the output data, and generates the .CSV output and plot image.
The tqharvc command line query is formed as:
tqharvc.exe -m [hostName] -s [startDate]:[startTime] -e [endDate]:[endTime] -a 10-minute -k [metric]
Say I want to query "serverA" for the metric "CPU:by LPAR:physc" from midnight on 11/01/2011 to 11:59:59 PM on 11/07/2011, with a sample every 10 minutes.
I call tqharvc with the following command line:
tqharvc.exe -m serverA -s 11012011:000000 -e 11072011:235959 -a 10-minute -k "CPU:by LPAR:physc"
My routine takes several command line parameters:
tqGatherMetrics.pl server.list physc.rpt 11/01/2011 00:00:00 11/07/2011 23:59:59
server.list is simply a text file with a list of servers to query:
serverA
serverB
serverC
serverD
serverE
serverF
physc.rpt is the aforementioned .RPT file. 11/01/2011 is the start date and 00:00:00 is the start time. 11/07/2011 is the end date and 23:59:59 is the end time of the queries.
Below is the output to STDERR from the perl routine as it crunches through the data from the servers:
Running query against serverA.
Executing TQ Query for serverA:CPU:by LPAR.
Processing query results.
Extracted 864 lines of data from query.
Processing results for serverA : CPU:by LPAR : 7
Generating graph for "serverA_CPU_by_LPAR_physc.gif"
Running query against serverB.
Executing TQ Query for serverB:CPU:by LPAR.
Processing query results.
Extracted 864 lines of data from query.
Processing results for serverB : CPU:by LPAR : 7
Generating graph for "serverB_CPU_by_LPAR_physc.gif"
Running query against serverC.
Executing TQ Query for serverC:CPU:by LPAR.
Processing query results.
Extracted 864 lines of data from query.
Processing results for serverC : CPU:by LPAR : 7
Generating graph for "serverC_CPU_by_LPAR_physc.gif"
Running query against serverD.
Executing TQ Query for serverD:CPU:by LPAR.
Processing query results.
Extracted 864 lines of data from query.
Processing results for serverD : CPU:by LPAR : 7
Generating graph for "serverD_CPU_by_LPAR_physc.gif"
Running query against serverE.
Executing TQ Query for serverE:CPU:by LPAR.
Processing query results.
Extracted 864 lines of data from query.
Processing results for serverE : CPU:by LPAR : 7
Generating graph for "serverE_CPU_by_LPAR_physc.gif"
Running query against serverF.
Executing TQ Query for serverF:CPU:by LPAR.
Processing query results.
Extracted 864 lines of data from query.
Processing results for serverF : CPU:by LPAR : 7
Generating graph for "serverF_CPU_by_LPAR_physc.gif"
The output of the plot appears as:
[scatter plot image: one of the generated server_CPU_by_LPAR_physc.gif files]
With the above command line I was able to generate plots for six servers in the server.list file.
If the .RPT file had listed multiple metrics, there would have been one plot generated per server for each metric. I chose the single metric physc as a simple example.
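As an aside, judging by how the parsing code below handles the Statistic line (it splits on commas and keeps reading continuation lines until the next section begins), a multi-metric report section might look something like the following. This is a hypothetical example I put together, not an excerpt from a real report, and the second statistic name is an assumption:

[dParm1]
System = "[?]"
Category Group = "CPU"
Category = "by LPAR"
Subcategory = ""
Statistic = "physc", "entc"
Value Types = "Average"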
The perl code to generate the plot follows:
use GD::Graph::points;
use Statistics::Descriptive;
use Time::Local;
use POSIX qw/ceil/;

use strict;

my $sourceList = shift;
my $tqReport   = shift;
my $startDate  = shift;
my $startTime  = shift;
my $endDate    = shift;
my $endTime    = shift;

# tqharvc expects MMDDYYYY dates and HHMMSS times, so strip the separators.
$startDate =~ s/\///g;
$endDate   =~ s/\///g;

$startTime =~ s/://g;
$endTime   =~ s/://g;

if (length($sourceList) == 0) {
    die "Usage: tqGatherMetrics.pl <sourceList> <tqReport> <startDate> <startTime> <endDate> <endTime>\n";
}

my $tqHarvestBinary = "C:\\Program Files\\TeamQuest\\manager\\bin\\tqharvc.exe";

if ( -f "$tqHarvestBinary" ) {
    if ( -f "$sourceList" ) {
        if ( -f "$tqReport" ) {

            my @hostNames  = ();
            my %metricHash = ();

            # Read the list of hosts to query; '#' lines are comments.
            open(HOSTNAMES, "$sourceList") || die "$!";
            while (<HOSTNAMES>) {
                chomp($_);
                if ($_ !~ /^#/) {
                    push(@hostNames, $_);
                }
            }
            close(HOSTNAMES);

            # Parse the .RPT file, mapping each "Category Group:Category"
            # pair to the list of statistics requested for it.
            open(REPORT, "$tqReport") || die "$!";

            my $catGroup = "::";
            my $catName  = "::";
            my $subCat   = "::";

            while (<REPORT>) {

                chomp($_);

                if ($_ =~ /Category Group = \"(.+?)\"/) {
                    $catGroup = $1;
                }

                if ($_ =~ /Category = \"(.+?)\"/) {
                    $catName = $1;
                }

                if ($_ =~ /Subcategory = \"(.+?)\"/) {
                    $subCat = $1;
                }

                # The Statistic list is comma separated and may continue
                # across lines, so read until the next section begins.
                if ($_ =~ /Statistic =/) {
                    my $tmpString = "";
                    $_ =~ s/Statistic =//g;
                    $tmpString = $_;
                    do {
                        $_ = <REPORT>;
                        chomp($_);
                        if ($_ !~ /^Resource|^Value Types/) {
                            $tmpString .= $_;
                        }
                    } until ($_ =~ /^Resource|^Value Types/);
                    my @statArray = split(/\,/, $tmpString);
                    $metricHash{"${catGroup}:${catName}"} = \@statArray;
                }

            }

            close(REPORT);

            foreach my $hostName (@hostNames) {
                print STDERR "Running query against $hostName.\n";
                my %metricData = ();
                foreach my $paramName (sort(keys(%metricHash))) {
                    my %columnHash = ();
                    my $linesExtracted = 0;
                    my $shellCmd = "\"$tqHarvestBinary\" -m $hostName -s $startDate:$startTime -e $endDate:$endTime -a 10-minute -k \"$paramName\"";
                    # my $shellCmd = "\"$tqHarvestBinary\" -m $hostName -s $startDate:$startTime -e $endDate:$endTime -a 1-minute -k \"$paramName\"";
                    print STDERR "\t\tExecuting TQ Query for $hostName:$paramName.\n";

                    open(OUTPUT, "$shellCmd |") || die "$!";

                    print STDERR "\t\tProcessing query results.\n";

                    my $totalColumns = 0;

                    while (<OUTPUT>) {
                        chomp($_);
                        if ($_ =~ /^Time:/) {
                            # Header row: note which columns hold the
                            # statistics named in the .RPT file.
                            my @columns = split(/\,/, $_);
                            my $statName = "";
                            for (my $index = 0; $index < $#columns; $index++) {
                                foreach $statName (@{$metricHash{$paramName}}) {
                                    $statName =~ s/^\s+//g; # ltrim
                                    $statName =~ s/\s+$//g; # rtrim
                                    $statName =~ s/\"//g;

                                    my $columnName = $columns[$index];
                                    if (index($columnName, $statName, 0) >= 0) {
                                        $columnHash{$index} = $columns[$index];
                                        $totalColumns++;
                                    }
                                }
                            }
                        } elsif ($_ =~ /^[0-9]/) {
                            # Data rows begin with a digit; key each value
                            # by its "date time" stamp.
                            my @columns = split(/\,/, $_);
                            foreach my $index (sort(keys(%columnHash))) {
                                $metricData{"$columns[0] $columns[1]"}{$columnHash{$index}} = $columns[$index];
                            }
                            $linesExtracted++;
                        }
                    }

                    close(OUTPUT);

                    if (($linesExtracted > 0) && ($totalColumns > 0)) {
                        print STDERR "\tExtracted $linesExtracted lines of data from query.\n";
                        my @domainData = ();

                        foreach my $timeStamp (sort dateSort keys(%metricData)) {
                            push(@domainData, $timeStamp);
                        }

                        foreach my $metricIndex (sort(keys(%columnHash))) {
                            print STDERR "\t\tProcessing results for $hostName : $paramName : $metricIndex\n";

                            my $metricName = $columnHash{$metricIndex};
                            my @rangeData = ();
                            my $stat = Statistics::Descriptive::Full->new();

                            foreach my $timeStamp (@domainData) {
                                push(@rangeData, $metricData{$timeStamp}{$columnHash{$metricIndex}});
                                $stat->add_data($metricData{$timeStamp}{$columnHash{$metricIndex}});
                            }

                            # Build filesystem-safe output names.
                            my $graphName = "${hostName}_${paramName}_${metricName}";
                            my $csvName = "";

                            $graphName =~ s/\\/_/g;
                            $graphName =~ s/\//_/g;
                            $graphName =~ s/\%/_/g;
                            $graphName =~ s/\:/_/g;
                            $graphName =~ s/\s/_/g;

                            $csvName = $graphName;
                            $graphName .= ".gif";
                            $csvName .= ".csv";

                            print STDERR "\t\tGenerating graph for \"$graphName\"\n";

                            open(CSVOUTPUT, ">$csvName") || die "$!";
                            print CSVOUTPUT "Timestamp,$paramName:$metricName\n";

                            my $i = 0;
                            foreach my $timeStamp (@domainData) {
                                print CSVOUTPUT "$domainData[$i],$rangeData[$i]\n";
                                $i++;
                            }

                            close(CSVOUTPUT);

                            my $dataMax = $stat->max();
                            my $dataMin = $stat->min();

                            if ($dataMax < 1) {
                                $dataMax = 0.5;
                            }

                            if ($dataMin > 0) {
                                $dataMin = 0;
                            }

                            # Push zero-valued samples well above the plotted
                            # y-range so they do not render on the graph.
                            for (my $rangeIndex = 0; $rangeIndex < $#rangeData; $rangeIndex++) {
                                if ($rangeData[$rangeIndex] == 0) {
                                    $rangeData[$rangeIndex] = $dataMax * 5;
                                }
                            }

                            my @data = (\@domainData, \@rangeData);
                            my $tqGraph = GD::Graph::points->new(1024, int(768/2));
                            my $totalMeasurements = $#{$data[0]} + 1;

                            $tqGraph->set(x_label_skip      => int($#domainData/40),
                                          x_labels_vertical => 1,
                                          markers           => [6],
                                          marker_size       => 2,
                                          y_label           => "$metricName",
                                          y_min_value       => $dataMin,
                                          y_max_value       => ceil($dataMax * 1.1),
                                          title             => "${hostName}:${paramName}:${metricName}, N = " . $stat->count() . "; avg = " . $stat->mean() . "; SD = " . $stat->standard_deviation() . "; 90th = " . $stat->percentile(90) . ".",
                                          line_types        => [1],
                                          line_width        => 1,
                                          dclrs             => ['red'],
                                         ) or warn $tqGraph->error;

                            $tqGraph->set_legend("TQ Measurement");
                            $tqGraph->set_legend_font(GD::gdMediumBoldFont);
                            $tqGraph->set_x_axis_font(GD::gdMediumBoldFont);
                            $tqGraph->set_y_axis_font(GD::gdMediumBoldFont);

                            my $tqImage = $tqGraph->plot(\@data) or die $tqGraph->error;

                            open(PICTURE, ">$graphName") || die "$!";
                            binmode PICTURE;
                            print PICTURE $tqImage->gif;
                            close(PICTURE);

                        }

                    } else {
                        print STDERR "#################### Nothing extracted for Hostname \"$hostName\" and metric \"$paramName\" ####################\n";
                    }

                }
            }

        } else {
            print STDERR "Could not find the TeamQuest Report file.\n";
        }
    } else {
        print STDERR "Could not find the list of hostnames to run against.\n";
    }
} else {
    print STDERR "Could not find the TeamQuest Manager TQHarvest binary at \"$tqHarvestBinary\". Cannot continue.\n";
}

sub dateSort {
    # Sort "MM/DD/YY HH:MM:SS" timestamps chronologically.
    my $a_value = dateToEpoch($a);
    my $b_value = dateToEpoch($b);

    return $a_value <=> $b_value;
}

sub dateToEpoch {
    my ($timeStamp) = @_;
    my ($dateString, $timeString) = split(/ /, $timeStamp);
    my ($month, $day, $year) = split(/\//, $dateString);
    my ($hour, $min, $sec) = split(/:/, $timeString);

    $year  += 2000;
    $month -= 1;

    return timegm($sec, $min, $hour, $day, $month, $year);
}
Sunday, November 20, 2011
Discrete Event Simulation of the Three Tier eBiz with SimPy
In a previous post I blogged on how I wrote a simulation solution in PDQ-R for a case study used in the book, Performance By Design.
I also wanted to expand my horizons and write a Discrete Event Simulation solution to go along with the heuristic solution of PDQ-R. The firm that I was working for at the time when I came up with this solution had a DES package by the name of SimScript II but they weren't willing to cough up a license so that I could learn their product. So, I searched the web and found a python package by the name of SimPy that can be used for Discrete Event Simulation.
There are some major differences between the heuristic solution and the DES solution. With the heuristic solution I was able to use some linear algebra to determine how many hits per second to send off to the various pages that consumed resources. With my SimPy solution I did not have that luxury, as I am simulating individual hits to each page one at a time. To get a proper distribution of hits I came up with this simple routine to dispatch the hits to pages:
class RandomPath:
    def RowSum(self, Vector):
        rowSum = 0.0
        for i in range(len(Vector)):
            rowSum += Vector[i]
        return rowSum

    def NextPage(self, T, i):
        rowSum = self.RowSum(T[i])
        randomValue = G.Rnd.uniform(0, rowSum)
        sumT = 0.0
        for j in range(len(T[i])):
            sumT += T[i][j]
            if randomValue < sumT:
                break
        return j
With this routine I take the sum of the probabilities from a row of the TransitionMatrix and generate a random value between zero and that sum. I then walk the row, summing the probabilities until the running sum exceeds the random value. The column where that happens determines the next page. For example, below is the TransitionMatrix as used in my python code:
#                      0     1     2     3     4     5     6     7
TransitionMatrix = [ [0.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],  # 0
                     [0.00, 0.00, 0.70, 0.00, 0.10, 0.00, 0.00, 0.20],  # 1
                     [0.00, 0.00, 0.45, 0.15, 0.10, 0.00, 0.00, 0.30],  # 2
                     [0.00, 0.00, 0.00, 0.00, 0.40, 0.00, 0.00, 0.60],  # 3
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.30, 0.55, 0.15],  # 4
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],  # 5
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],  # 6
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00] ] # 7
Say we are currently on the search page, as represented by row #2 of the above stochastic matrix:
   0     1     2     3     4     5     6     7
[0.00, 0.00, 0.45, 0.15, 0.10, 0.00, 0.00, 0.30]  # 2
Suppose the random value drawn is 0.67:
Column 0 is 0; my sum is 0.
Column 1 is 0; my sum is 0.
Column 2 is 0.45; my sum is 0.45, which is less than my random value, 0.67.
Column 3 is 0.15; my sum is 0.60, which is less than my random value, 0.67.
Column 4 is 0.10; my sum is 0.70, which is greater than my random value, 0.67. I pass back 4 from the routine, and the next row in the random walk will be row 4 (which represents the login page).
What we end up with is a Markov random walk of the transition matrix. Multiple comparisons of random walks versus the heuristic analysis came out fairly close. Of course, with a random variable the result is never identical, but it comes out fairly close with each iteration.
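One quick way to see this for yourself (this is a sanity-check sketch of mine, not code from the original analysis) is to tally page visits over a long random walk and look at the relative frequencies. It reuses the walk logic from NextPage above and repeats the transition matrix so it runs standalone:

from random import Random

# Transition matrix and page names from the listing, repeated here
# so this sketch runs on its own.
PageNames = ["Entry", "Home", "Search", "View", "Login", "Create", "Bid", "Exit"]
TransitionMatrix = [ [0.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
                     [0.00, 0.00, 0.70, 0.00, 0.10, 0.00, 0.00, 0.20],
                     [0.00, 0.00, 0.45, 0.15, 0.10, 0.00, 0.00, 0.30],
                     [0.00, 0.00, 0.00, 0.00, 0.40, 0.00, 0.00, 0.60],
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.30, 0.55, 0.15],
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],
                     [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00] ]

rnd = Random(12345)

def next_page(T, i):
    # Same walk as NextPage above: draw a uniform value against the row
    # sum and step through the row until the running sum passes it.
    random_value = rnd.uniform(0, sum(T[i]))
    running = 0.0
    for j in range(len(T[i])):
        running += T[i][j]
        if random_value < running:
            break
    return j

visits = [0] * len(PageNames)
page = 1  # start at Home, as the Generator in the full listing does
for _ in range(100000):
    visits[page] += 1
    page = 1 if page == 7 else next_page(TransitionMatrix, page)  # Exit wraps to Home

total = float(sum(visits))
for name, count in zip(PageNames, visits):
    print name, count / total  # relative frequency of each page

Run it a few times with different seeds and the page mix settles down to roughly the same proportions, which is what makes the comparison against the heuristic page mix meaningful.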
The great thing about this algorithm is that it can be used both for Discrete Event Simulation and in load testing as well. At an eCommerce site where I worked, I had created a perl routine to slice and dice IIS logs; using the unique shopper ID in the logged cookies, I created a Markov transition matrix of actual customer traffic. The resulting matrix wasn't small. For our site it was over 500 x 500 elements. That's quite a lot of different random paths to take.
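That perl routine isn't shown here, but the core of the idea is small. Below is a minimal Python sketch of the same technique under an assumed input format, one (shopper_id, page) pair per hit in time order; the field names and sample data are hypothetical, not from the actual site:

from collections import defaultdict

# Hypothetical input: one (shopper_id, page) pair per hit, in time order,
# as might be extracted from IIS logs via the shopper ID cookie.
hits = [
    ("s1", "home"), ("s1", "search"), ("s1", "search"), ("s1", "view"),
    ("s2", "home"), ("s2", "login"), ("s2", "bid"),
]

counts = defaultdict(lambda: defaultdict(int))
last_page = {}

for shopper, page in hits:
    if shopper in last_page:
        counts[last_page[shopper]][page] += 1  # tally the observed transition
    last_page[shopper] = page

# Normalize each row of counts into transition probabilities.
matrix = {}
for src, dests in counts.items():
    total = float(sum(dests.values()))
    matrix[src] = dict((dst, n / total) for dst, n in dests.items())

print matrix

Each row of the resulting matrix sums to 1, so it can be fed straight into the same NextPage-style walk shown above.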
The idea was to take this information and modify our LoadRunner HTTP Vusers to make use of this data so that our scripts would properly emulate actual customers, as opposed to crude page summary analysis. With this mechanism our load tests would properly represent customers, and we would have been able to rerun the analysis once a week to make sure that our scenarios always reflected how customers browsed the site.
Unfortunately, I never got to put my ideas into action; I only got as far as creating the stochastic transition matrix and some crude perl code that demonstrated random walks representing customers. All in all, that was the easy part. The real hard part of the project would have been coding the HTTP Vusers to properly handle the page-to-page state transitions for the non-happy routes that customers sometimes follow.
Below is the SimPy solution that I came up with for the case study presented:
#!/usr/bin/env python
import sys
from SimPy.Simulation import *
from random import Random, expovariate, uniform

class Metrics:
    metrics = dict()

    def Add(self, metricName, frameNumber, value):
        # Record one response-time sample for a page, bucketed by frame.
        if self.metrics.has_key(metricName):
            if self.metrics[metricName].has_key(frameNumber):
                self.metrics[metricName][frameNumber].append(value)
            else:
                self.metrics[metricName][frameNumber] = list()
                self.metrics[metricName][frameNumber].append(value)
        else:
            self.metrics[metricName] = dict()
            self.metrics[metricName][frameNumber] = list()
            self.metrics[metricName][frameNumber].append(value)

    def Keys(self):
        return self.metrics.keys()

    def Mean(self, metricName):
        # Flatten all frames' samples into one list and average them.
        valueArray = list()
        if self.metrics.has_key(metricName):
            for frame in self.metrics[metricName].keys():
                for value in self.metrics[metricName][frame]:
                    valueArray.append(value)
            if len(valueArray) != 0:
                total = 0.0
                for i in range(len(valueArray)):
                    total += valueArray[i]
                return total/len(valueArray)
            else:
                return 0  # Need to learn python throwing exceptions
        else:
            return 0

class RandomPath:
    def RowSum(self, Vector):
        rowSum = 0.0
        for i in range(len(Vector)):
            rowSum += Vector[i]
        return rowSum

    def NextPage(self, T, i):
        # Walk row i of transition matrix T until the running probability
        # sum exceeds a uniform random draw; that column is the next page.
        rowSum = self.RowSum(T[i])
        randomValue = G.Rnd.uniform(0, rowSum)
        sumT = 0.0
        for j in range(len(T[i])):
            sumT += T[i][j]
            if randomValue < sumT:
                break
        return j

class G:
    numWS = 1
    numAS = 1
    numDS = 2
    Rnd = Random(12345)
    PageNames = ["Entry", "Home", "Search", "View", "Login", "Create", "Bid", "Exit"]
    Entry = 0
    Home = 1
    Search = 2
    View = 3
    Login = 4
    Create = 5
    Bid = 6
    Exit = 7
    WS = 0
    AS = 1
    DS = 2
    CPU = 0
    DISK = 1
    WS_CPU = 0
    WS_DISK = 1
    AS_CPU = 2
    AS_DISK = 3
    DS_CPU = 4
    DS_DISK = 5
    metrics = Metrics()
    #           e  h  s  v  l  c  b  e
    HitCount = [0, 0, 0, 0, 0, 0, 0, 0]
    Resources = [[ Resource(1), Resource(1) ],  # WS CPU and DISK
                 [ Resource(1), Resource(1) ],  # AS CPU and DISK
                 [ Resource(1), Resource(1) ]]  # DS CPU and DISK
    #                  Enter  Home   Search View   Login  Create Bid    Exit
    ServiceDemand = [ [0.000, 0.008, 0.009, 0.011, 0.060, 0.012, 0.015, 0.000],  # WS_CPU
                      [0.000, 0.030, 0.010, 0.010, 0.010, 0.010, 0.010, 0.000],  # WS_DISK
                      [0.000, 0.000, 0.030, 0.035, 0.025, 0.045, 0.040, 0.000],  # AS_CPU
                      [0.000, 0.000, 0.008, 0.080, 0.009, 0.011, 0.012, 0.000],  # AS_DISK
                      [0.000, 0.000, 0.010, 0.009, 0.015, 0.070, 0.045, 0.000],  # DS_CPU
                      [0.000, 0.000, 0.035, 0.018, 0.050, 0.080, 0.090, 0.000] ] # DS_DISK
    # Type B shopper
    #                      0     1     2     3     4     5     6     7
    TransitionMatrix = [ [0.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],  # 0
                         [0.00, 0.00, 0.70, 0.00, 0.10, 0.00, 0.00, 0.20],  # 1
                         [0.00, 0.00, 0.45, 0.15, 0.10, 0.00, 0.00, 0.30],  # 2
                         [0.00, 0.00, 0.00, 0.00, 0.40, 0.00, 0.00, 0.60],  # 3
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.30, 0.55, 0.15],  # 4
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],  # 5
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],  # 6
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00] ] # 7

class DoWork(Process):
    def __init__(self, i, resource, serviceDemand, nodeName, pageName):
        Process.__init__(self)
        self.frame = i
        self.resource = resource
        self.serviceDemand = serviceDemand
        self.nodeName = nodeName
        self.pageName = pageName

    def execute(self):
        # Queue for the resource, hold it for the service demand, release
        # it, and record the residence time R.
        StartUpTime = now()
        yield request, self, self.resource
        yield hold, self, self.serviceDemand
        yield release, self, self.resource
        R = now() - StartUpTime
        G.metrics.Add(self.pageName, self.frame, R)

class CallPage(Process):
    def __init__(self, i, node, pageName):
        Process.__init__(self)
        self.frame = i
        self.StartUpTime = 0.0
        self.currentPage = node
        self.pageName = pageName

    def execute(self):
        # Fire off the CPU and disk work for each tier that this page
        # places demand on.
        if self.currentPage != G.Exit:
            print >> sys.stderr, "Working on Frame # ", self.frame, " @ ", now(), " for page ", self.pageName
            self.StartUpTime = now()
            if G.ServiceDemand[G.WS_CPU][self.currentPage] > 0.0:
                wsCPU = DoWork(self.frame, G.Resources[G.WS][G.CPU], G.ServiceDemand[G.WS_CPU][self.currentPage]/G.numWS, "wsCPU", self.pageName)
                activate(wsCPU, wsCPU.execute())
            if G.ServiceDemand[G.WS_DISK][self.currentPage] > 0.0:
                wsDISK = DoWork(self.frame, G.Resources[G.WS][G.DISK], G.ServiceDemand[G.WS_DISK][self.currentPage]/G.numWS, "wsDISK", self.pageName)
                activate(wsDISK, wsDISK.execute())
            if G.ServiceDemand[G.AS_CPU][self.currentPage] > 0.0:
                asCPU = DoWork(self.frame, G.Resources[G.AS][G.CPU], G.ServiceDemand[G.AS_CPU][self.currentPage]/G.numAS, "asCPU", self.pageName)
                activate(asCPU, asCPU.execute())
            if G.ServiceDemand[G.AS_DISK][self.currentPage] > 0.0:
                asDISK = DoWork(self.frame, G.Resources[G.AS][G.DISK], G.ServiceDemand[G.AS_DISK][self.currentPage]/G.numAS, "asDISK", self.pageName)
                activate(asDISK, asDISK.execute())
            if G.ServiceDemand[G.DS_CPU][self.currentPage] > 0.0:
                dsCPU = DoWork(self.frame, G.Resources[G.DS][G.CPU], G.ServiceDemand[G.DS_CPU][self.currentPage]/G.numDS, "dsCPU", self.pageName)
                activate(dsCPU, dsCPU.execute())
            if G.ServiceDemand[G.DS_DISK][self.currentPage] > 0.0:
                dsDISK = DoWork(self.frame, G.Resources[G.DS][G.DISK], G.ServiceDemand[G.DS_DISK][self.currentPage]/G.numDS, "dsDISK", self.pageName)
                activate(dsDISK, dsDISK.execute())
        G.HitCount[self.currentPage] += 1
        yield hold, self, 0.00001  # Needed to prevent an error. Doesn't add any blocking to the six queues above

class Generator(Process):
    def __init__(self, rate, maxT, maxN):
        Process.__init__(self)
        self.name = "Generator"
        self.rate = rate
        self.maxN = maxN
        self.maxT = maxT
        self.g = Random(11335577)
        self.i = 0
        self.currentPage = G.Home

    def execute(self):
        # Fire page hits with exponential inter-arrival times, walking the
        # transition matrix to choose each next page.
        while (now() < self.maxT):
            self.i += 1
            p = CallPage(self.i, self.currentPage, G.PageNames[self.currentPage])
            activate(p, p.execute())
            yield hold, self, self.g.expovariate(self.rate)
            randomPath = RandomPath()
            if self.currentPage == G.Exit:
                self.currentPage = G.Home
            else:
                self.currentPage = randomPath.NextPage(G.TransitionMatrix, self.currentPage)

def main():
    maxWorkLoad = 10000
    Lambda = 4.026*float(sys.argv[1])
    maxSimTime = float(sys.argv[2])

    initialize()
    g = Generator(Lambda, maxSimTime, maxWorkLoad)
    activate(g, g.execute())
    simulate(until=maxSimTime)

    print >> sys.stderr, "Simulated Seconds : ", maxSimTime
    print >> sys.stderr, "Page Hits :"
    for i in range(len(G.PageNames)):
        print >> sys.stderr, "\t", G.PageNames[i], " = ", G.HitCount[i]
    print >> sys.stderr, "Throughput : "
    for i in range(len(G.PageNames)):
        print >> sys.stderr, "\t", G.PageNames[i], " = ", G.HitCount[i]/maxSimTime
    print >> sys.stderr, "Mean Response Times:"
    for i in G.metrics.Keys():
        print >> sys.stderr, "\t", i, " = ", G.metrics.Mean(i)
    print G.HitCount[G.Home]/maxSimTime, ",", G.metrics.Mean("Home"), ",", G.metrics.Mean("View"), ",", G.metrics.Mean("Search"), ",", G.metrics.Mean("Login"), ",", G.metrics.Mean("Create"), ",", G.metrics.Mean("Bid")

if __name__ == '__main__': main()
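Assuming the listing is saved as, say, ebiz_sim.py (my file name, not from the post), the first argument scales the base arrival rate of 4.026 hits per second and the second argument is the number of simulated seconds to run:

python ebiz_sim.py 1.0 3600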
In a future blog post I will compare the results of the Discrete Event Simulation versus my PDQ-R solution and finally compare both of those against a work up done with TeamQuest Model just for good measure.