Reading Data

class pygna.reading_class.ReadingData[source]

Abstract class used to read different types of files. You can implement your own reading method, but each subclass must implement the readfile and get_data methods.

get_data()[source]

Get the data from the reading class. This method must always be overridden.
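
The subclassing contract described above can be sketched as follows. This is a minimal illustration, not pygna's own code: the `ReadingData` base below is a hypothetical stand-in built with the standard `abc` module so that the example runs without pygna installed, and `ReadGeneList`, the file name, and the sample gene symbols are assumptions made for the example. Only the method names `readfile` and `get_data` come from the description above.

```python
from abc import ABC, abstractmethod
import os
import tempfile


class ReadingData(ABC):
    """Hypothetical stand-in for pygna.reading_class.ReadingData (assumption)."""

    @abstractmethod
    def readfile(self):
        """Parse the file; every subclass must provide this."""

    @abstractmethod
    def get_data(self):
        """Return the parsed data; every subclass must provide this."""


class ReadGeneList(ReadingData):
    """Hypothetical reader: one gene name per line, blank lines skipped."""

    def __init__(self, filename: str):
        self.filename = filename
        # Parse once at construction so get_data() is a cheap accessor.
        self.data = self.readfile()

    def readfile(self) -> list:
        with open(self.filename) as handle:
            return [line.strip() for line in handle if line.strip()]

    def get_data(self) -> list:
        return self.data


# Usage: write a small sample file, then read it back through the subclass.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "genes.txt")
    with open(path, "w") as handle:
        handle.write("BRCA1\nTP53\n\nEGFR\n")
    genes = ReadGeneList(path).get_data()
```

In this sketch, a subclass that omits either method cannot be instantiated, which is one way to enforce the "must always be overridden" rule; whether pygna enforces it the same way is not stated here.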

class pygna.reading_class.ReadTsv(filename: str, pd_table: bool = False, int_type: int = None)[source]

This class is used to read and parse a network file in a tab-separated format (tsv).

get_data() → pandas.core.frame.DataFrame[source]

Returns the data of the tsv file

Returns:list representing the genes read in the file

Example

>>> tsvdata = ReadTsv("mydata.tsv").get_data()

get_network() → networkx.classes.graph.Graph[source]

Returns the nx.graph object of the network

Returns:graph containing the network information

Example

>>> network = ReadTsv("mydata.tsv").get_network()

class pygna.reading_class.ReadGmt(filename: str, read_descriptor: bool = False)[source]

This class is used to read a gmt file, which stores gene sets: each entry has a setname followed by its genes, separated by a comma.

get_data() → dict[source]

Returns the data of the gmt file

Returns:dict representing the genes list

Example

>>> gmtdata = ReadGmt("mydata.gmt").get_data()

get_geneset(setname: str = None) → dict[source]

Returns the geneset from the gmt file

Parameters:setname – the setname to extract
Returns:the geneset data

Example

>>> gmtdata = ReadGmt("mydata.gmt").get_geneset("brca")

class pygna.reading_class.ReadCsv(filename: str, sep: str = ', ', use_cols: list = None, column_to_fill: str = None)[source]

This class is used to read a csv file.

get_data() → pandas.core.frame.DataFrame[source]

Returns the data of the csv file

Returns:dataframe representing the data read inside the .csv

Example

>>> csvdata = ReadCsv("mydata.csv").get_data()

class pygna.reading_class.ReadTxt(filename: str)[source]

This class reads a txt file containing a single gene per line

get_data() → pandas.core.frame.DataFrame[source]

Get the dataframe from the class

Returns:dataframe object from the file read

Example

>>> txtdata = ReadTxt("mydata.txt").get_data()

class pygna.reading_class.ReadDistanceMatrix(filename: str, in_memory: bool = False)[source]

This class reads a distance matrix in the HDF5 format

get_data() → [list, numpy.matrix][source]

Returns the data of the HDF5 matrix

Returns:table nodes, the nodes of the HDF5 matrix, and table data, the data of the HDF5 matrix

Example

>>> nodes, data = ReadDistanceMatrix("mydata.hdf5").get_data()