Commit b38e4178 authored by Nitin Shukla

DataScienceWithJulia

parent 46816def
%% Cell type:markdown id: tags:
## DataFrames
- A DataFrame is a tabular representation of data, similar to a spreadsheet or a data matrix.
- The observations are rows and the variables are columns.
$$
\mathrm{x} =
\begin{pmatrix}
x_{11} & x_{12} \\
x_{21} & x_{22} \\
x_{31} & x_{32} \\
\vdots & \vdots \\
x_{n1} & x_{n2}
\end{pmatrix}
$$
A dataframe is a computer representation of a data matrix.
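
As a minimal sketch of this correspondence, assuming DataFrames.jl 1.0 or later (for the `:auto` column naming), a small matrix can be wrapped in a DataFrame and converted back. The matrix `X` and its values are made up for illustration.

```julia
using DataFrames

# Hypothetical 3×2 data matrix: rows are observations, columns are variables
X = [1.0 2.0;
     3.0 4.0;
     5.0 6.0]

# :auto asks DataFrames.jl to generate the column names x1, x2, ...
df = DataFrame(X, :auto)

nrow(df), ncol(df)   # number of observations and number of variables
Matrix(df)           # convert the DataFrame back to a plain matrix
```

Unlike the matrix, the resulting table carries column names and allows each column to have its own element type.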
%% Cell type:markdown id: tags:
#### In Julia, the DataFrame type is available through the DataFrames.jl package.
There are several convenient features of a DataFrame, including:
- columns can be different Julia types;
- table cell entries can be missing;
- metadata can be associated with a DataFrame;
- columns can be named; and
- tables can be subsetted by row, column or both.
- The columns of a DataFrame most often hold integers, floats or strings, and they are referred to by name using Julia symbols such as `:x1`.
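
The features listed above can be illustrated with a short sketch. The table `toy` and its column names are hypothetical and exist only for this example; the calls assume a recent DataFrames.jl release.

```julia
using DataFrames

# Hypothetical toy table: three columns of different types, one missing entry
toy = DataFrame(
    id = [1, 2, 3],
    score = [2.5, missing, 4.0],
    level = ["High", "Low", "Medium"],
)

eltype.(eachcol(toy))       # element type of each column
toy.score                   # access a column by property name
toy[!, :level]              # access a column with a Symbol (no copy)
toy[2:3, [:id, :level]]     # subset by rows and columns at once
dropmissing(toy, :score)    # keep only rows where score is not missing
```

Indexing with `!` returns the stored column itself, while `toy[:, :level]` would return a copy; which one is appropriate depends on whether the result will be mutated.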
%% Cell type:code id: tags:
``` julia
import Pkg; Pkg.instantiate()
```
%% Cell type:code id: tags:
``` julia
# Packages
using BenchmarkTools
using DataFrames
using DelimitedFiles
using CSV
using XLSX
using Downloads
```
%% Cell type:code id: tags:
``` julia
using DataFrames, Distributions, StatsBase, Random
```
%% Cell type:code id: tags:
``` julia
Random.seed!(825);
N = 50;
# Create a sample DataFrame.
# Initially the DataFrame has N rows and 3 columns.
df1 = DataFrame(
    x1 = rand(Normal(2, 1), N),
    x2 = [sample(["High", "Medium", "Low"],
                 pweights([0.25, 0.45, 0.30])) for i = 1:N],
    x3 = rand(Pareto(2, 1), N)
)
```
%% Cell type:markdown id: tags:
# Read your data from text files
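
A self-contained sketch of the round trip through a delimited text file, assuming CSV.jl and DataFrames.jl are installed. The temporary file, the table `toy` and its columns are hypothetical and say nothing about the contents of the file used in this notebook.

```julia
using CSV, DataFrames

# Hypothetical table written to a temporary file so the example can run anywhere
path = joinpath(mktempdir(), "example.csv")
toy = DataFrame(name = ["a", "b"], value = [1, 2])
CSV.write(path, toy)

# Read the file back; the second argument selects the output table type
back = CSV.read(path, DataFrame)

size(back)     # (rows, columns)
names(back)    # column names taken from the header line
```

`CSV.read` infers the column types from the file contents, so checking `size` and `names` right after loading is a cheap sanity check.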
%% Cell type:code id: tags:
``` julia
Data = CSV.read("MostPopLang.csv", DataFrame);
```
%% Cell type:code id: tags:
``` julia
@show typeof(Data)
```
%% Cell type:code id: tags:
``` julia
size(Data)
```
%% Cell type:code id: tags:
``` julia
Data[190:203, :]
```
%% Cell type:code id: tags:
``` julia
@show typeof(Data)
```
%% Cell type:code id: tags:
``` julia
```