I decided to write another perl utility to stitch together indiviual .CSV files together into a single .CSV so that I could import the single .CSV and create area plots in Excel. That worked like a champ but there was a problem. The problem was that the hostnames of the virtual machines running on the frame are cryptic; they don't make any sense unless you have all of them memorized. This really frustrated me.
I had been creating the single .CSV and then adding an extra tab to the spreadsheet with the cryptic hostname and an simple description of the purpose of the VM so that my stacked area plots would be easier to read.
This worked like a champ, but the extra time to add the lookup information, add the =vlookup() function to the spreadsheet just got to be annoying. Once again, I decided to use perl to solve this problem for me.
First, the routine that I wrote to combine the individual .CSV files into a single .CSV. This routine uses a directory listing of the individual .CSV files that I want to stitch together. This is pretty simple to generate under Windows command line with the `dir *.csv /b > csv.list` command where csv.list is the list of .CSVs that I want to stitch together. All the individual .CSV files need to have the first column be a common date/time stamp, otherwise it won't work correctly. This isn't an issue since my tqharvc perl routine dumps the first column as a date/time stamp.
The code:
1:  use strict;  
2:    
3:  my $csvList = shift;  
4:  open(LIST, "$csvList") || die "$!";  
5:    
6:  my @csvArray = ();  
7:  my %matrixHash = ();  
8:    
9:  while (<LIST>) {  
10:   chomp($_);  
11:   push(@csvArray, $_);  
12:  };  
13:    
14:  foreach my $csvFile (@csvArray) {  
15:   open(CSV, "$csvFile") || die "$!";  
16:   print STDERR "Working on $csvFile\n";  
17:   while(<CSV>) {  
18:    if ($. > 1) {  
19:     chomp($_);  
20:     my @columns = split(/,/, $_);  
21:     $matrixHash{$columns[0]}{$csvFile} = $columns[1];  
22:    };  
23:   };  
24:   close(CSV);  
25:  };  
26:    
27:  # header line  
28:    
29:  print STDOUT "timeStamp";  
30:  foreach my $csvFile (@csvArray) {  
31:   my @columns = split(/_/, $csvFile);  
32:   print STDOUT ",$columns[0]";  
33:  };  
34:  print STDOUT "\n";  
35:    
36:  # actual data into CSV matrix  
37:    
38:  foreach my $timeStamp (sort(keys(%matrixHash))) {  
39:   print STDOUT "$timeStamp";  
40:   foreach my $csvFile (@csvArray) {  
41:    print STDOUT ",$matrixHash{$timeStamp}{$csvFile}"  
42:   };  
43:   print STDOUT "\n";  
44:  };  
45:    
The beauty of this code is that I can take tens, hundreds, even thousands of individual .CSV files and generate a single .CSV that can be imported into Excel for number crunching.
 vm1,Virtual Machine Number 1  
 vm2,Virtual Machine Number 2  
 vm3,Virtual Machine Number 3  
 vm4,Virtual Machine Number 4  
 vm5,Virtual Machine Number 5  
 vm6,Virtual Machine Number 6  
 vm7,Virtual Machine Number 7  
 vm8,Virtual Machine Number 8  
 vm9,Virtual Machine Number 9  
 vm10,Virtual Machine Number 10   
 vm11,Virtual Machine Number 11  
 vm12,Virtual Machine Number 12  
 vmvio_pri,Production VIO  
 vmvio_sec,Production VIO  
 vmvio_nonprod_pri,Non-Production VIO  
 vmvio_nonprod_sec,Non-Production VIO  
1:  use strict;  
2:  use GD::Graph::area;  
3:    
4:  my $csvData  = shift;  
5:  my $lookupData = shift;  
6:  my $graphName = shift;  
7:    
8:  $graphName .= ".gif";  
9:    
10:  open(DATA, "$csvData")   || die "$!";  
11:  open(LOOKUP, "$lookupData") || die "$!";  
12:    
13:  my %lookupHash = ();  
14:  my %areaHash  = ();  
15:  my @columnNames = ();  
16:    
17:  while(<LOOKUP>) {  
18:   chomp($_);  
19:   my ($lpar, $name) = split(/,/, $_);  
20:   $lookupHash{$lpar} = $name;  
21:  };  
22:    
23:  while(<DATA>) {  
24:   chomp($_);  
25:   my @columns = split(/,/, $_);  
26:   if ($. == 1) {  
27:    @columnNames = @columns;  
28:    foreach my $index (1..$#columns) {  
29:     # $columnNames[$index] = $lookupHash{$columnNames[$index]};  
30:     $columnNames[$index] .= " - $lookupHash{$columnNames[$index]}";  
31:    };  
32:   } else {  
33:    foreach my $i (1..$#columns) {  
34:     $areaHash{$columns[0]}{$columnNames[$i]} = $columns[$i];  
35:    };  
36:   };  
37:  };  
38:    
39:  my %timeStampsToDelete = ();  
40:    
41:  foreach my $timeStamp (keys(%areaHash)) {  
42:   foreach my $columnName (keys(%{$areaHash{$timeStamp}})) {  
43:    if ($areaHash{$timeStamp}{$columnName} < 0 || $areaHash{$timeStamp}{$columnName} eq '') {  
44:     $timeStampsToDelete{$timeStamp}++;  
45:    };  
46:   };  
47:  };  
48:    
49:  foreach my $timeStamp (sort(keys(%timeStampsToDelete))) {  
50:   print STDERR "Deleting $timeStamp from areaHash\n";  
51:   delete $areaHash{$timeStamp};  
52:  };  
53:    
54:  my @dataArray = ();  
55:  my @dataLine = ();  
56:    
57:  my $elementCount = 0;  
58:  my @legendArray = ();  
59:  my $i = 0;  
60:    
61:  foreach my $columnName (@columnNames) {  
62:   my @outputArray = ();  
63:   if ($columnName !~ /timeStamp/) {  
64:    my $j= 0;  
65:    foreach my $timeStamp (sort(keys(%areaHash))) {  
66:     $dataLine[$j++] = $areaHash{$timeStamp}{$columnName};  
67:    };  
68:    @outputArray = @dataLine;  
69:    push(@legendArray, $columnName);  
70:   } else {  
71:    foreach my $timeStamp (sort(keys(%areaHash))) {  
72:     push(@outputArray, $timeStamp);  
73:     $elementCount++;  
74:    };  
75:   };  
76:   $dataArray[$i] = \@outputArray;  
77:   $i++;  
78:  };  
79:     
80:  my $mygraph = GD::Graph::area->new(1280, 1024);  
81:    
82:  $mygraph->set(x_label_skip   => int($elementCount/40),  
83:         x_labels_vertical => 1,  
84:         y_label      => "physc",  
85:         y_min_value    => 0,  
86:         y_max_value    => 16,  
87:         y_tick_number   => 16,  
88:         title       => "Stacked physc utilization",  
89:         cumulate     => 1,  
90:  ) or warn $mygraph->error;  
91:    
92:  $mygraph->set_legend(@legendArray);  
93:    
94:  $mygraph->set_legend_font(GD::gdMediumBoldFont);  
95:  $mygraph->set_x_axis_font(GD::gdMediumBoldFont);  
96:  $mygraph->set_y_axis_font(GD::gdMediumBoldFont);  
97:    
98:  my $myimage = $mygraph->plot(\@dataArray) or die $mygraph->error;  
99:    
100:  open(PICTURE, ">$graphName");  
101:  binmode PICTURE;  
102:  print PICTURE $myimage->gif;  
103:  close(PICTURE);  
104:    
The output of the code becomes exactly what I wanted:
I don't believe that the area plot looks as nice as the Excel output, but now it is effortless to generate a stacked area plot of VM physc utilization as part of a daily cron job for reporting purposes.

 
No comments:
Post a Comment