
Data Types and Data Structures in Statistics



STATISTICS – Introduction:

Statistics is the discipline concerned with the collection, organization, analysis, interpretation, and presentation of data.

In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.

Definition:

The science of collecting, presenting, analyzing, and reasonably interpreting data.

• Statistics provides a rigorous scientific method for gaining insight into data.

• Statistics can give an instant overall picture of data through graphical presentation or numerical summarization, regardless of the number of data points.

• Besides data summarization, another important task of statistics is to make inferences and predict relationships between variables.

  

What is Statistics: 


Statistics is a set of mathematical methods and tools that enable us to answer important questions about data. It is divided into two categories:

  1. Descriptive Statistics: this offers methods to summarise data by transforming raw observations into meaningful information that is easy to interpret and share.
  2. Inferential Statistics: this offers methods to study a small sample of data and generalize the conclusions to the entire population (the entire domain).
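As a rough illustration of the two categories, the sketch below summarises a small sample (descriptive) and then estimates a range for the population mean (inferential). The exam scores are made-up data, the helper names `describe` and `mean_confidence_interval` are hypothetical, and the interval uses a simple normal approximation rather than any method prescribed by this post.

```python
import math
import statistics

# Hypothetical sample: exam scores from 10 students (illustrative data only).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 75]


def describe(values):
    """Descriptive statistics: summarise the observations we actually have."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


def mean_confidence_interval(values, z=1.96):
    """Inferential statistics: approximate 95% interval for the population mean.

    Assumption: normal approximation with z = 1.96. For a sample this small,
    a t-based interval would be somewhat wider.
    """
    m = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return (m - z * standard_error, m + z * standard_error)


summary = describe(scores)
low, high = mean_confidence_interval(scores)

print(summary)
print(f"approx. 95% interval for the population mean: ({low:.1f}, {high:.1f})")
```

The first function only describes the ten scores in hand; the second makes a claim about the larger group those scores were drawn from, which is the step that separates inference from description.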

Statistics and machine learning are two closely related areas of study. Statistics is an important prerequisite for applied machine learning, as it helps us select, evaluate, and interpret predictive models.


Statistics and Machine Learning:

Statistics is at the core of machine learning. You can’t solve real-world problems with machine learning if you don’t have a good grasp of statistical fundamentals.

Some factors certainly make learning statistics hard: mathematical equations, Greek notation, and meticulously defined concepts can make it difficult to develop an interest in the subject.

We can address these issues with simple and clear explanations, appropriately paced tutorials, and hands-on labs to solve problems with applied statistical methods.

From exploratory data analysis to designing hypothesis-testing experiments, statistics plays an integral role in solving problems across all major industries and domains.

Anyone who wishes to develop a deep understanding of machine learning should learn how statistical methods form the foundation for regression and classification algorithms, how statistics allows us to learn from data, and how it helps us extract meaning from unlabeled data.

The Five Basic Words of Statistics: 

The five words Population, Sample, Parameter, Statistic (singular), and Variable form the basic vocabulary of statistics. 

You cannot learn much about statistics unless you first learn the meanings of these five words.

1. Population:  All the members of a group about which you want to draw a conclusion.

2. Sample: The part of the population selected for analysis.

3. Parameter: A numerical measure that describes a characteristic of a population.

4. Statistic: A numerical measure that describes a characteristic of a sample.

5. Variable: A characteristic of an item or an individual that will be analysed using statistics.
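The five terms can be made concrete with a small sketch. Everything here is an assumption for illustration: the population is simulated (heights of 1,000 employees at a made-up company), and the fixed seed only makes the run repeatable.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Variable: the characteristic recorded for each individual (height in cm).

# Population: every member of the group we want to draw a conclusion about.
# Here: simulated heights of all 1,000 employees of a hypothetical company.
population = [random.gauss(170, 10) for _ in range(1000)]

# Sample: the part of the population selected for analysis.
sample = random.sample(population, 50)

# Parameter: a numerical measure of the population (usually unknown in practice).
population_mean = statistics.mean(population)

# Statistic: the same measure computed on the sample, used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In real work the population is rarely available in full, so the parameter stays unknown and the statistic computed from the sample is what we actually report.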
