I decided to write another Perl utility to stitch individual .CSV files together into a single .CSV so that I could import that one file and create area plots in Excel. That worked like a champ, but there was a problem: the hostnames of the virtual machines running on the frame are cryptic; they don't make any sense unless you have all of them memorized. This really frustrated me.
I had been creating the single .CSV and then adding an extra tab to the spreadsheet with each cryptic hostname and a simple description of the VM's purpose so that my stacked area plots would be easier to read. That worked, but the extra time spent adding the lookup information and the =VLOOKUP() formulas to the spreadsheet got to be annoying. Once again, I decided to use Perl to solve the problem for me.
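For reference, the manual step being replaced was an Excel formula along these lines (the tab name `Lookup` is my placeholder here, not necessarily what the spreadsheet used):

```
=VLOOKUP(A1, Lookup!$A:$B, 2, FALSE)
```

With the cryptic hostname in A1, this does an exact-match lookup against the two-column hostname/description table and returns the description.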
First, the routine that I wrote to combine the individual .CSV files into a single .CSV. It takes a directory listing of the individual .CSV files that I want to stitch together, which is simple to generate at the Windows command line with `dir *.csv /b > csv.list`, where csv.list is the list of .CSVs. Every individual .CSV file must have a common date/time stamp in its first column, otherwise the merge won't line up correctly. That isn't an issue here, since my tqharvc Perl routine writes the date/time stamp as the first column.
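The same list is one command under Unix as well; a minimal sketch, where the file names and the script name `combine.pl` are placeholders of mine:

```shell
# Work in a scratch directory with a couple of sample per-VM .CSV files
# (file names here are hypothetical).
cd "$(mktemp -d)"
touch vm1_physc.csv vm2_physc.csv

# Unix equivalent of the Windows `dir *.csv /b > csv.list` step:
ls *.csv > csv.list

# The list then feeds the stitching script (hypothetical filename):
#   perl combine.pl csv.list > combined.csv
cat csv.list
```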
The code:
1: use strict;
2:
3: my $csvList = shift;
4: open(LIST, "$csvList") || die "$!";
5:
6: my @csvArray = ();
7: my %matrixHash = ();
8:
9: while (<LIST>) {
10: chomp($_);
11: push(@csvArray, $_);
12: };
13:
14: foreach my $csvFile (@csvArray) {
15: open(CSV, "$csvFile") || die "$!";
16: print STDERR "Working on $csvFile\n";
17: while(<CSV>) {
18: if ($. > 1) {
19: chomp($_);
20: my @columns = split(/,/, $_);
21: $matrixHash{$columns[0]}{$csvFile} = $columns[1];
22: };
23: };
24: close(CSV);
25: };
26:
27: # header line
28:
29: print STDOUT "timeStamp";
30: foreach my $csvFile (@csvArray) {
31: my @columns = split(/_/, $csvFile);
32: print STDOUT ",$columns[0]";
33: };
34: print STDOUT "\n";
35:
36: # actual data into CSV matrix
37:
38: foreach my $timeStamp (sort(keys(%matrixHash))) {
39: print STDOUT "$timeStamp";
40: foreach my $csvFile (@csvArray) {
41: print STDOUT ",$matrixHash{$timeStamp}{$csvFile}";
42: };
43: print STDOUT "\n";
44: };
45:
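As a concrete (hypothetical) example, given two input files named vm1_physc.csv and vm2_physc.csv, each with a header row and the common date/time stamp in the first column, the script uses the portion of each file name before the first "_" as the column header:

```
vm1_physc.csv:
timeStamp,physc
2010/06/01 00:00,0.52
2010/06/01 00:05,0.48

vm2_physc.csv:
timeStamp,physc
2010/06/01 00:00,1.10
2010/06/01 00:05,1.23

combined output:
timeStamp,vm1,vm2
2010/06/01 00:00,0.52,1.10
2010/06/01 00:05,0.48,1.23
```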
The beauty of this code is that I can take tens, hundreds, even thousands of individual .CSV files and generate a single .CSV that can be imported into Excel for number crunching.

To eliminate the manual lookup step, I also created a simple two-column .CSV that maps each cryptic hostname to a description of the VM:
vm1,Virtual Machine Number 1
vm2,Virtual Machine Number 2
vm3,Virtual Machine Number 3
vm4,Virtual Machine Number 4
vm5,Virtual Machine Number 5
vm6,Virtual Machine Number 6
vm7,Virtual Machine Number 7
vm8,Virtual Machine Number 8
vm9,Virtual Machine Number 9
vm10,Virtual Machine Number 10
vm11,Virtual Machine Number 11
vm12,Virtual Machine Number 12
vmvio_pri,Production VIO
vmvio_sec,Production VIO
vmvio_nonprod_pri,Non-Production VIO
vmvio_nonprod_sec,Non-Production VIO
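Because the plotting script appends each description to the hostname (line #30 below), the legend entries come out in the form "hostname - description", for example:

```
vm1 - Virtual Machine Number 1
vm2 - Virtual Machine Number 2
```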
Fortunately, GD::Graph::area will automagically generate a stacked area plot if the proper option is set in the hash for the area plot, which is `cumulate => 1,` on line #89:

1: use strict;
2: use GD::Graph::area;
3:
4: my $csvData = shift;
5: my $lookupData = shift;
6: my $graphName = shift;
7:
8: $graphName .= ".gif";
9:
10: open(DATA, "$csvData") || die "$!";
11: open(LOOKUP, "$lookupData") || die "$!";
12:
13: my %lookupHash = ();
14: my %areaHash = ();
15: my @columnNames = ();
16:
17: while(<LOOKUP>) {
18: chomp($_);
19: my ($lpar, $name) = split(/,/, $_);
20: $lookupHash{$lpar} = $name;
21: };
22:
23: while(<DATA>) {
24: chomp($_);
25: my @columns = split(/,/, $_);
26: if ($. == 1) {
27: @columnNames = @columns;
28: foreach my $index (1..$#columns) {
29: # $columnNames[$index] = $lookupHash{$columnNames[$index]};
30: $columnNames[$index] .= " - $lookupHash{$columnNames[$index]}";
31: };
32: } else {
33: foreach my $i (1..$#columns) {
34: $areaHash{$columns[0]}{$columnNames[$i]} = $columns[$i];
35: };
36: };
37: };
38:
39: my %timeStampsToDelete = ();
40:
41: foreach my $timeStamp (keys(%areaHash)) {
42: foreach my $columnName (keys(%{$areaHash{$timeStamp}})) {
43: if (!defined($areaHash{$timeStamp}{$columnName}) || $areaHash{$timeStamp}{$columnName} eq '' || $areaHash{$timeStamp}{$columnName} < 0) {
44: $timeStampsToDelete{$timeStamp}++;
45: };
46: };
47: };
48:
49: foreach my $timeStamp (sort(keys(%timeStampsToDelete))) {
50: print STDERR "Deleting $timeStamp from areaHash\n";
51: delete $areaHash{$timeStamp};
52: };
53:
54: my @dataArray = ();
55: my @dataLine = ();
56:
57: my $elementCount = 0;
58: my @legendArray = ();
59: my $i = 0;
60:
61: foreach my $columnName (@columnNames) {
62: my @outputArray = ();
63: if ($columnName !~ /timeStamp/) {
64: my $j= 0;
65: foreach my $timeStamp (sort(keys(%areaHash))) {
66: $dataLine[$j++] = $areaHash{$timeStamp}{$columnName};
67: };
68: @outputArray = @dataLine;
69: push(@legendArray, $columnName);
70: } else {
71: foreach my $timeStamp (sort(keys(%areaHash))) {
72: push(@outputArray, $timeStamp);
73: $elementCount++;
74: };
75: };
76: $dataArray[$i] = \@outputArray;
77: $i++;
78: };
79:
80: my $mygraph = GD::Graph::area->new(1280, 1024);
81:
82: $mygraph->set(x_label_skip => int($elementCount/40),
83: x_labels_vertical => 1,
84: y_label => "physc",
85: y_min_value => 0,
86: y_max_value => 16,
87: y_tick_number => 16,
88: title => "Stacked physc utilization",
89: cumulate => 1,
90: ) or warn $mygraph->error;
91:
92: $mygraph->set_legend(@legendArray);
93:
94: $mygraph->set_legend_font(GD::gdMediumBoldFont);
95: $mygraph->set_x_axis_font(GD::gdMediumBoldFont);
96: $mygraph->set_y_axis_font(GD::gdMediumBoldFont);
97:
98: my $myimage = $mygraph->plot(\@dataArray) or die $mygraph->error;
99:
100: open(PICTURE, ">$graphName") || die "$!";
101: binmode PICTURE;
102: print PICTURE $myimage->gif;
103: close(PICTURE);
104:
The output of the code is exactly what I wanted. The GD::Graph area plot doesn't look quite as nice as the Excel output, but now it is effortless to generate a stacked area plot of VM physc utilization as part of a daily cron job for reporting purposes.
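The daily cron job can then be a one-liner; a sketch, where the directory and the script names (combine.pl, areaplot.pl) are placeholders of mine, not the actual file names:

```
# m h dom mon dow  command
0 6 * * * cd /data/physc && ls *.csv > csv.list && perl combine.pl csv.list > combined.csv && perl areaplot.pl combined.csv lookup.csv physc_stacked
```

The third argument to the plotting script is the graph name, so this run would write physc_stacked.gif.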