Cosan  1.0
Data Analytics Library
Cosan::CosanData< NumericType > Class Template Reference

Data container.

#include <CosanData.h>

Inheritance diagram for Cosan::CosanData< NumericType >:
Cosan::CosanRawData< NumericType > Cosan::CosanBO

Public Member Functions

 CosanData ()=default
 
 CosanData (gsl::index nrows, gsl::index ncols, NumericType lb=0, NumericType ub=1)
 Generate a random nrows-by-ncols matrix with each entry sampled uniformly from lb to ub.
 
 CosanData (const CosanMatrix< NumericType > &inputX)
 Construct CosanData from CosanMatrix<NumericType> inputX.
 
 CosanData (const CosanMatrix< NumericType > &inputX, const CosanMatrix< NumericType > &inputY)
 Construct CosanData from CosanMatrix<NumericType> inputX and inputY.
 
 CosanData (const std::vector< NumericType > &inputX, gsl::index nrows, const std::string &order="rowfirst")
 Construct CosanData from std::vector inputX, filling the data in either 'rowfirst' or 'columnfirst' order.
 
 CosanData (const std::vector< NumericType > &inputX, const std::vector< NumericType > &inputY, gsl::index nrows, const std::string &order="rowfirst")
 Construct CosanData from std::vector inputX and inputY, filling the data in either 'rowfirst' or 'columnfirst' order.
 
virtual const std::string GetName () const
 Get the name of the object.
 
- Public Member Functions inherited from Cosan::CosanRawData< NumericType >
 CosanRawData ()=default
 
 CosanRawData (const std::string &srcX, const std::string &srcY)
 Constructor: read data X and Y from CSV files and form the raw data container.
 
 CosanRawData (const std::string &srcX)
 Constructor: read data X from a CSV file and form the raw data container.
 
void SetInput (const std::string &srcX)
 Update input X from a CSV file.
 
void SetTarget (const std::string &srcY)
 Update target Y from a CSV file.
 
void ConcatenateData (const CosanMatrix< NumericType > &inputX)
 Concatenate CosanMatrix<NumericType> inputX to X as new columns.
 
void UpdateData (const CosanMatrix< NumericType > &inputX)
 Update X using CosanMatrix<NumericType> inputX.
 
void UpdateData (const CosanMatrix< NumericType > &inputX, const CosanMatrix< NumericType > &inputY)
 Update X and Y using CosanMatrix<NumericType> inputX and inputY.
 
void UpdateCat (const std::vector< std::string > &inputX)
 Update categorical vector svaluesX using std::vector<std::string> inputX.
 
void UpdateCat (const std::vector< std::string > &inputX, const std::vector< std::string > &inputY)
 Update categorical vectors svaluesX and svaluesY using std::vector<std::string> inputX and inputY.
 
CosanMatrix< NumericType > GetInput ()
 Get a copy of CosanMatrix<NumericType> X.
 
CosanMatrix< NumericType > GetTarget ()
 Get a copy of CosanMatrix<NumericType> Y.
 
const CosanMatrix< NumericType > & GetInput () const
 Get a const reference to CosanMatrix<NumericType> X.
 
const CosanMatrix< NumericType > & GetTarget () const
 Get a const reference to CosanMatrix<NumericType> Y.
 
std::tuple< gsl::index, gsl::index > GetMissingNumber ()
 Get the missing-data counts as a tuple of two indices.
 
const std::string & GetSummaryMessageX () const
 Get the summary message from reading the CSV file for X.
 
const std::string & GetSummaryMessageY () const
 Get the summary message from reading the CSV file for Y.
 
std::unordered_map< gsl::index, gsl::index > & GetRawToNumIdx ()
 Map from raw data column index to numeric data matrix X column index.
 
std::unordered_map< gsl::index, gsl::index > & GetRawToCatIdx ()
 Map from raw data column index to categorical data column index.
 
std::vector< std::vector< gsl::index > > GetIdxpinfX () const
 Get the positions of positive infinity in the original data X.
 
std::vector< std::vector< gsl::index > > GetIdxminfX () const
 Get the positions of negative infinity in the original data X.
 
std::vector< std::vector< gsl::index > > GetIdxmissingX () const
 Get the positions of missing values in the original data X.
 
std::vector< std::vector< gsl::index > > GetIdxpinfY () const
 Get the positions of positive infinity in the original data Y.
 
std::vector< std::vector< gsl::index > > GetIdxminfY () const
 Get the positions of negative infinity in the original data Y.
 
std::vector< std::vector< gsl::index > > GetIdxmissingY () const
 Get the positions of missing values in the original data Y.
 
std::set< gsl::index > GetcolCatX () const
 Get the column indices (in the original CSV file for X) of the categorical columns.
 
std::set< gsl::index > GetcolCatY () const
 Get the column indices (in the original CSV file for Y) of the categorical columns.
 
bool GetcatY () const
 True if Y is of categorical data type, false otherwise.
 
gsl::index GetrowsX ()
 Get the number of rows of X.
 
gsl::index GetrowsY ()
 Get the number of rows of Y.
 
gsl::index GetcolsX ()
 Get the number of columns of X.
 
gsl::index GetcolsY ()
 Get the number of columns of Y.
 
std::vector< std::string > GetsvaluesX () const
 Get the vector of categorical data from X, in row-first order.
 
std::vector< std::string > GetsvaluesY () const
 Get the vector of categorical data from Y, in row-first order.
 
CosanMatrix< NumericType > GetType ()
 
- Public Member Functions inherited from Cosan::CosanBO
 CosanBO ()
 Default constructor.
 

Additional Inherited Members

- Protected Attributes inherited from Cosan::CosanRawData< NumericType >
CosanMatrix< NumericType > X
 Numeric data from the original CSV file for X.
 
CosanMatrix< NumericType > Y
 Numeric data from the original CSV file for Y.
 
CosanMatrix< NumericType > __TYPE
 
std::string SummaryMessageX
 Loading message.
 
std::string SummaryMessageY
 
std::vector< std::vector< gsl::index > > IdxpinfX
 Positions of positive infinity, negative infinity and missing values.
 
std::vector< std::vector< gsl::index > > IdxminfX
 
std::vector< std::vector< gsl::index > > IdxmissingX
 
std::vector< std::vector< gsl::index > > IdxpinfY
 
std::vector< std::vector< gsl::index > > IdxminfY
 
std::vector< std::vector< gsl::index > > IdxmissingY
 
std::set< gsl::index > colCatX
 Column indices in the original data that hold categorical data.
 
std::set< gsl::index > colCatY
 
bool catY = false
 True means the response variable Y is categorical.
 
gsl::index rowsX = 0
 Number of rows of X.
 
gsl::index colsX = 0
 Number of columns of X.
 
gsl::index rowsY = 0
 Number of rows of Y.
 
gsl::index colsY = 0
 Number of columns of Y.
 
std::vector< std::string > svaluesX
 
std::vector< std::string > svaluesY
 

Detailed Description

template<Numeric NumericType>
class Cosan::CosanData< NumericType >

Data container.

Apart from the defaulted default constructor, every constructor takes at least one input. To obtain CosanData, the following constructors can be used:

CosanData(gsl::index nrows,gsl::index ncols,NumericType lb=0,NumericType ub = 1)
CosanData(const CosanMatrix<NumericType> & inputX)
CosanData(const CosanMatrix<NumericType>& inputX,const CosanMatrix<NumericType>& inputY)
CosanData(const std::vector<NumericType>& inputX,gsl::index nrows,const std::string & order = "rowfirst")
CosanData(const std::vector<NumericType>& inputX,const std::vector<NumericType>& inputY,gsl::index nrows,const std::string & order = "rowfirst")

Definition at line 546 of file CosanData.h.

Constructor & Destructor Documentation

◆ CosanData() [1/6]

template<Numeric NumericType>
Cosan::CosanData< NumericType >::CosanData ( )
default

◆ CosanData() [2/6]

template<Numeric NumericType>
Cosan::CosanData< NumericType >::CosanData ( gsl::index  nrows,
gsl::index  ncols,
NumericType  lb = 0,
NumericType  ub = 1 
)
inline

Generate a random nrows-by-ncols matrix with each entry sampled uniformly from lb to ub.

Definition at line 552 of file CosanData.h.

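CosanData(gsl::index nrows, gsl::index ncols, NumericType lb = 0, NumericType ub = 1)
    : CosanRawData<NumericType>() {
    this->X.resize(nrows, ncols);
    std::default_random_engine generator;  // default-constructed, so default-seeded
    std::uniform_real_distribution<double> distribution(lb, ub);
    // Fill in row-major order: entry i lands at (i / ncols, i % ncols).
    for (gsl::index i = 0; i < nrows * ncols; i++) {
        this->X(i / ncols, i % ncols) = distribution(generator);
    }
}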
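The fill loop above can be tried outside the library. The following is a minimal standalone sketch, not the library's code: Matrix and RandomUniform are hypothetical names, and a nested std::vector stands in for CosanMatrix. It keeps the same index mapping and the same default-constructed engine, which means repeated calls with the same arguments produce the same matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for CosanMatrix: a plain row-major 2-D container.
using Matrix = std::vector<std::vector<double>>;

// Fill an nrows-by-ncols matrix with samples drawn uniformly between lb and ub.
Matrix RandomUniform(std::size_t nrows, std::size_t ncols,
                     double lb = 0.0, double ub = 1.0) {
    assert(lb < ub);  // extra guard added by this sketch
    Matrix out(nrows, std::vector<double>(ncols));
    // A default-constructed engine uses its fixed default seed, so the
    // sequence of samples is identical on every call.
    std::default_random_engine generator;
    std::uniform_real_distribution<double> distribution(lb, ub);
    for (std::size_t i = 0; i < nrows * ncols; ++i) {
        // Same mapping as the listing: flat index i -> (row, column).
        out[i / ncols][i % ncols] = distribution(generator);
    }
    return out;
}
```

If different matrices are wanted on each call, one option is to seed the engine explicitly, for example from std::random_device.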

◆ CosanData() [3/6]

template<Numeric NumericType>
Cosan::CosanData< NumericType >::CosanData ( const CosanMatrix< NumericType > &  inputX)
inline

Construct CosanData from CosanMatrix<NumericType> inputX.

Definition at line 563 of file CosanData.h.

CosanData(const CosanMatrix<NumericType>& inputX) : CosanRawData<NumericType>() {
    static_assert(std::is_arithmetic<NumericType>::value, "NumericType must be numeric");
    this->X = inputX;  // copy inputX into X
}

◆ CosanData() [4/6]

template<Numeric NumericType>
Cosan::CosanData< NumericType >::CosanData ( const CosanMatrix< NumericType > &  inputX,
const CosanMatrix< NumericType > &  inputY 
)
inline

Construct CosanData from CosanMatrix<NumericType> inputX and inputY.

Definition at line 570 of file CosanData.h.

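CosanData(const CosanMatrix<NumericType>& inputX, const CosanMatrix<NumericType>& inputY)
    : CosanRawData<NumericType>() {
    static_assert(std::is_arithmetic<NumericType>::value, "NumericType must be numeric");
    this->X = inputX;  // copy inputX into X
    this->Y = inputY;  // copy inputY into Y
}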

◆ CosanData() [5/6]

template<Numeric NumericType>
Cosan::CosanData< NumericType >::CosanData ( const std::vector< NumericType > &  inputX,
gsl::index  nrows,
const std::string &  order = "rowfirst" 
)
inline

Get CosanData from std::vector inputX, fill the data either by 'rowfirst' or 'columnfirst'.

Definition at line 577 of file CosanData.h.

CosanData(const std::vector<NumericType>& inputX, gsl::index nrows,
          const std::string& order = "rowfirst")
    : CosanRawData<NumericType>() {
    // nrows must not exceed the input size and must divide it evenly.
    if (nrows > inputX.size() || inputX.size() % nrows != 0) {
        throw std::invalid_argument(
            fmt::format("Incorrect nrows specification, should be less than or equal to input vector size and size is divisible by nrows. Input vector size is "
                        "{:} and nrows is {:}", inputX.size(), nrows));
    }
    this->X.resize(nrows, inputX.size() / nrows);
    gsl::index i = 0, __cols = inputX.size() / nrows;
    if (order == "columnfirst") {
        // Column-first fill: element i goes to (i % nrows, i / nrows).
        for (auto& each : inputX) {
            this->X(i % nrows, i / nrows) = each;
            i++;
        }
        return;
    }
    // Any other value of order: row-first fill, element i goes to (i / __cols, i % __cols).
    for (auto& each : inputX) {
        this->X(i / __cols, i % __cols) = each;
        i++;
    }
}
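To make the two fill orders concrete, here is a minimal standalone sketch of the same index arithmetic. It is an illustration only: Matrix and FillFromVector are hypothetical names, a nested std::vector stands in for CosanMatrix, and the nrows == 0 guard is an addition of this sketch. As in the listing above, any order string other than "columnfirst" falls through to the row-first fill.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for CosanMatrix: a plain row-major 2-D container.
using Matrix = std::vector<std::vector<double>>;

// Reshape a flat vector into an nrows-by-(size / nrows) matrix.
Matrix FillFromVector(const std::vector<double>& input, std::size_t nrows,
                      const std::string& order = "rowfirst") {
    // Extra guard (not in the listing): reject nrows == 0 before the modulo.
    if (nrows == 0 || nrows > input.size() || input.size() % nrows != 0) {
        throw std::invalid_argument("nrows must be positive and divide the input size");
    }
    const std::size_t ncols = input.size() / nrows;
    Matrix out(nrows, std::vector<double>(ncols));
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (order == "columnfirst") {
            out[i % nrows][i / nrows] = input[i];  // walk down each column
        } else {
            out[i / ncols][i % ncols] = input[i];  // walk along each row
        }
    }
    assert(out.size() == nrows);
    return out;
}
```

With the input {1, 2, 3, 4, 5, 6} and nrows = 2, the row-first fill gives the rows (1, 2, 3) and (4, 5, 6), while the column-first fill gives (1, 3, 5) and (2, 4, 6).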

◆ CosanData() [6/6]

template<Numeric NumericType>
Cosan::CosanData< NumericType >::CosanData ( const std::vector< NumericType > &  inputX,
const std::vector< NumericType > &  inputY,
gsl::index  nrows,
const std::string &  order = "rowfirst" 
)
inline

Get CosanData from std::vector inputX and inputY, fill the data either by 'rowfirst' or 'columnfirst'.

Definition at line 605 of file CosanData.h.

CosanData(const std::vector<NumericType>& inputX, const std::vector<NumericType>& inputY,
          gsl::index nrows, const std::string& order = "rowfirst")
    : CosanRawData<NumericType>() {
    // nrows must divide inputX evenly, and inputY must hold exactly nrows values.
    if (nrows > inputX.size() || inputX.size() % nrows != 0 || nrows != inputY.size()) {
        throw std::invalid_argument(
            fmt::format("Incorrect nrows specification, should be less than or equal to input vector size and size is divisible by nrows. inputY size should also be equal to nrows. "
                        "inputX vector size is {:}, inputY vector size is {:} and nrows is {:}", inputX.size(), inputY.size(), nrows));
    }
    this->X.resize(nrows, inputX.size() / nrows);
    this->Y.resize(nrows, 1);  // Y is stored as a single column
    gsl::index i = 0, __cols = inputX.size() / nrows;
    for (auto& each : inputY) {
        this->Y(i, 0) = each;
        i++;
    }
    i = 0;
    if (order == "columnfirst") {
        // Column-first fill: element i goes to (i % nrows, i / nrows).
        for (auto& each : inputX) {
            this->X(i % nrows, i / nrows) = each;
            i++;
        }
        return;
    }
    // Any other value of order: row-first fill, element i goes to (i / __cols, i % __cols).
    for (auto& each : inputX) {
        this->X(i / __cols, i % __cols) = each;
        i++;
    }
}
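The size checks at the top of this constructor can be read as one rule: inputX must split into nrows equal rows, and inputY must supply exactly one value per row. The sketch below isolates that rule in a hypothetical helper, CheckedCols, which is not part of the library. It uses unsigned sizes throughout and also rejects nrows == 0, a case the listing above would reach the modulo with.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Return the number of columns of X when the sizes are consistent;
// throw std::invalid_argument otherwise.
std::size_t CheckedCols(std::size_t sizeX, std::size_t sizeY, std::size_t nrows) {
    if (nrows == 0 || nrows > sizeX || sizeX % nrows != 0 || nrows != sizeY) {
        throw std::invalid_argument(
            "Inconsistent sizes: inputX size is " + std::to_string(sizeX) +
            ", inputY size is " + std::to_string(sizeY) +
            " and nrows is " + std::to_string(nrows));
    }
    const std::size_t ncols = sizeX / nrows;
    assert(ncols * nrows == sizeX);  // follows from the divisibility check
    return ncols;
}
```

For example, six X values with two Y values and nrows = 2 give three columns, whereas six X values with three Y values and nrows = 2 are rejected.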

Member Function Documentation

◆ GetName()

template<Numeric NumericType>
virtual const std::string Cosan::CosanData< NumericType >::GetName ( ) const
inline virtual

Get the name of the object.

Reimplemented from Cosan::CosanRawData< NumericType >.

Definition at line 637 of file CosanData.h.

virtual const std::string GetName() const { return "Processed Data Object."; }
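Because GetName() is virtual and reimplemented here, a call through a base-class reference resolves to the most derived version. The sketch below illustrates that mechanism with two hypothetical classes, RawDataSketch and DataSketch. The base-class string "Raw Data Object." is a placeholder chosen for this example, since this page does not show what CosanRawData returns.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the base class.
struct RawDataSketch {
    virtual ~RawDataSketch() = default;
    // Placeholder label; the real base-class string is not shown on this page.
    virtual const std::string GetName() const { return "Raw Data Object."; }
};

// Hypothetical stand-in for the derived class, using the label from the listing.
struct DataSketch : RawDataSketch {
    const std::string GetName() const override { return "Processed Data Object."; }
};

// Ask for the name through a base-class reference; virtual dispatch
// selects the override of the object's dynamic type.
std::string NameViaBase(const RawDataSketch& obj) {
    const std::string name = obj.GetName();
    assert(!name.empty());  // both sketch classes return a non-empty label
    return name;
}
```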

The documentation for this class was generated from the following file: CosanData.h