Test Set

In machine learning, a test set is a separate dataset used to assess the performance and generalization ability of a trained machine learning model. The test set is distinct from the training set, which is the data used to train the model. The primary purpose of the test set is to evaluate how well the model can make predictions on new, previously unseen data.

Several datasets are typically used when training and evaluating a machine learning model. A training set and a test set are the minimum; adding a validation set is considered best practice:

Training Set: This is the portion of the dataset used to train the machine learning model. The model learns patterns and relationships in the training data by adjusting its parameters (weights and biases) based on a specified objective, usually minimizing a loss function.
Validation Set (Best Practice): In addition to the training set, a separate validation set may be used during the training process to tune hyperparameters and monitor the model's performance on data it was not trained on, which helps detect overfitting to the training data.
Test Set: Once the model has been trained and, where applicable, validated, it is evaluated on a completely independent dataset called the test set. The test set contains examples that the model has never seen during training or validation, so evaluating on it gives a more realistic estimate of how well the model generalizes to new instances.
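
The three roles described above can be illustrated with a small sketch. It is not tied to any particular library: the toy data, the helper names (knn_predict, accuracy), and the choice of a k-nearest-neighbors classifier are all assumptions made purely for illustration. The point is the workflow: fit on the training set, choose the hyperparameter k on the validation set, and consult the test set only once at the end.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Predict the label of x by majority vote among the k nearest training points."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of (feature, label) pairs in data that are predicted correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in data)
    return correct / len(data)

# Hypothetical toy data: (feature, label) pairs, already split into three sets.
train_set = [(1.0, 0), (1.5, 0), (2.0, 0), (2.2, 1),
             (6.0, 1), (6.5, 1), (7.0, 1), (6.8, 0)]
val_set = [(1.2, 0), (1.9, 0), (6.2, 1), (6.7, 1)]
test_set = [(0.8, 0), (1.8, 0), (5.9, 1), (7.2, 1)]

# Tune the hyperparameter k using the validation set only.
best_k = max([1, 3, 5], key=lambda k: accuracy(train_set, val_set, k))

# Evaluate once on the test set, which played no part in choosing k.
test_accuracy = accuracy(train_set, test_set, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

Because the test set is never used to pick k, the final accuracy is not biased by the tuning step. With data this small the number itself means little; the separation of roles is what the sketch is meant to show.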

The process of splitting a dataset into training, validation, and test sets is crucial to ensure that the model's performance metrics accurately reflect its generalization capabilities. A common practice is to use a majority of the data for training, a smaller portion for validation, and a separate portion for testing. The exact split ratio depends on the size of the dataset and specific requirements.
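
As a minimal sketch of such a split, the function below shuffles a dataset and cuts it into three parts. The function name train_val_test_split, its parameters, and the 70/15/15 default ratio are illustrative assumptions, not a recommendation from the text above. In practice, a library utility such as scikit-learn's train_test_split is commonly used instead.

```python
import random

def train_val_test_split(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle items and split them into training, validation, and test lists."""
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")
    shuffled = list(items)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test_part = shuffled[:n_test]
    val_part = shuffled[n_test:n_test + n_val]
    train_part = shuffled[n_test + n_val:]  # the remaining majority of the data
    return train_part, val_part, test_part

# Hypothetical dataset of 100 examples, represented here by their indices.
train_part, val_part, test_part = train_val_test_split(range(100))
print(len(train_part), len(val_part), len(test_part))
```

The three parts are disjoint, so no example used for training can leak into validation or testing. A random shuffle like this assumes the examples are independent; for time-ordered data, a chronological split is usually more appropriate.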

By evaluating a model on a test set, machine learning practitioners can make informed decisions about the model's performance and its potential to make accurate predictions on new, real-world data.