How can I separate a huge dataset into bins and average them, or aggregate
data by date, in R?
I'm trying to develop a program to allow visualization of big data in
graphs. Basically, the idea is that I can input a huge dataset and output
a line graph in which I can actually see the trends.
Here is my idea. Please let me know if algorithms like this are already
built into R or a package, as I realize this is a very basic or
'primitive' way of aggregating data. I don't want to use sample(), because
I am specifically looking for trends in the data. I also realize that
there will always be a trade-off between the accuracy of the data and the
ease of representing it in this case.
Let's say I have a standard csv dataset of 10,000 numeric rows (columns
representing variables). I want to create a resulting dataset that takes
this huge dataset and separates it into 20-30 bins, each bin representing
a datapoint that is the average of a certain number of data points in the
big dataset. For example, if I had 10 bins, each bin would be the average
of 1,000 datapoints.
Here is my code:
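library(plyr)     # provides ddply
library(ggplot2)  # provides the diamonds example dataset

average <- function(dataf)
{
  # keep only the numeric columns
  numericData <- dataf[, sapply(dataf, is.numeric)]
  # *** column means of the numeric data
  colMeans(numericData, na.rm = TRUE)
}
x <- names(diamonds)[sapply(diamonds, is.numeric)]
real <- ddply(diamonds, .(x), average)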
*** I do not know what to do here. This is the place where I want to
separate numericData into a certain number of bins, with the data in each
bin averaged out.
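To make the binning idea concrete, here is a rough sketch of what I have in mind, using only base R. The helper name bin_means, the n_bins argument and the example data frame big are made up for illustration, and I am not claiming this is the idiomatic or most efficient approach. It assigns rows to bins in row order, so it assumes the rows are already sorted the way I want to plot them.

```r
# Hypothetical helper: average every numeric column within
# consecutive, (nearly) equal-sized bins of rows.
bin_means <- function(dataf, n_bins = 10) {
  # keep only the numeric columns (drop = FALSE keeps a data frame
  # even if only one column is numeric)
  numeric_data <- dataf[, sapply(dataf, is.numeric), drop = FALSE]
  n <- nrow(numeric_data)
  # bin index 1..n_bins for each row, assigned in row order
  bin <- ceiling(seq_len(n) * n_bins / n)
  # one output row per bin, holding the mean of each column
  aggregate(numeric_data, by = list(bin = bin), FUN = mean, na.rm = TRUE)
}

# Made-up example data: 10,000 rows, two numeric variables
set.seed(1)
big <- data.frame(a = rnorm(10000), b = runif(10000))
binned <- bin_means(big, n_bins = 10)
```

With 10,000 rows and 10 bins, each row of binned would be the average of 1,000 consecutive rows of big, which is the reduced dataset I would then pass to the line graph.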
On another important note, most of the datasets I input will have Date
variables (this is why I mentioned a line graph). The mean() function only
works on numeric data, so how could I average out a time column? By
averaging out, I mean that if the time column is in YYYY-MM-DD format, I
could aggregate the days and graph the data by month. If that is possible,
I might not even have to worry about binning the other columns by row
count! How can I do this?
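For the date case, this is roughly the kind of result I am after, sketched with base R only. The data frame daily and its columns date and value are invented for the example, and I am assuming the date column has already been parsed with as.Date(); there may well be a better way to do this in a package.

```r
# Made-up example data: one value per day for three months
daily <- data.frame(
  date  = seq(as.Date("2013-01-01"), as.Date("2013-03-31"), by = "day"),
  value = 1:90
)

# Label each row with its year and month, e.g. "2013-01"
daily$month <- format(daily$date, "%Y-%m")

# Average the values within each month: one row per month
monthly <- aggregate(value ~ month, data = daily, FUN = mean)
```

Here monthly would have one row per month, so plotting monthly$value against monthly$month gives the coarser line graph I described.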
Thanks for any input, and sorry for the long post; I felt I needed to
provide all the necessary information.
 