graforvfl.network package¶

graforvfl.network.base_rvfl module¶

class graforvfl.network.base_rvfl.BaseRVFL(size_hidden=10, act_name='sigmoid', weight_initializer='random_uniform', trainer='MPI', alpha=0.5, seed=None)[source]¶

Bases: BaseEstimator

This class defines the general Random Vector Functional Link (RVFL) network. It is a single-hidden-layer network with direct connections between the input and output layers.

Parameters:

size_hidden (int, default=10) – Number of nodes in the hidden layer.
act_name (str, default="sigmoid") – Name of the activation function for the hidden layer. Supported values: ["none", "relu", "leaky_relu", "celu", "prelu", "gelu", "elu", "selu", "rrelu", "tanh", "hard_tanh", "sigmoid", "hard_sigmoid", "log_sigmoid", "silu", "swish", "hard_swish", "soft_plus", "mish", "soft_sign", "tanh_shrink", "soft_shrink", "hard_shrink", "softmin", "softmax", "log_softmax"]
weight_initializer (str, default="random_uniform") – Method for initializing the input-to-hidden weights. Supported methods: ["orthogonal", "he_uniform", "he_normal", "glorot_uniform", "glorot_normal", "lecun_uniform", "lecun_normal", "random_uniform", "random_normal"]. For definitions of these methods, see https://keras.io/api/layers/initializers/
trainer (str, default="MPI") –
The method used to train the hidden-to-output and input-to-output weights.
- MPI: Moore-Penrose inversion (ordinary least squares without regularization)
- L2: least squares regression with L2 regularization
alpha (float, optional, default=0.5) – Regularization parameter for L2 training. Effective only when trainer="L2".
seed (int, default=None) – Determines random number generation for weights and bias initialization. Pass an int for reproducible results across multiple function calls.
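
The parameters above can be illustrated with a minimal NumPy sketch of an RVFL network. This is not the library's implementation: the function names `rvfl_fit` and `rvfl_predict`, the uniform range of the random weights, the fixed sigmoid activation, and the absence of an output bias are all assumptions made for illustration. It shows the two ideas the description names: a random, untrained hidden layer combined with a direct input-to-output link, and closed-form output weights obtained either by Moore-Penrose inversion ("MPI") or by L2-regularized least squares ("L2").

```python
import numpy as np

def rvfl_fit(X, y, size_hidden=10, trainer="MPI", alpha=0.5, seed=None):
    """Illustrative RVFL training: random hidden layer + closed-form output weights."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Input-to-hidden weights and biases are drawn once and never trained.
    W = rng.uniform(-1.0, 1.0, size=(n_features, size_hidden))
    b = rng.uniform(-1.0, 1.0, size=size_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid hidden activations
    D = np.hstack([H, X])                          # hidden features + direct input link
    if trainer == "MPI":
        beta = np.linalg.pinv(D) @ y               # Moore-Penrose pseudo-inverse
    elif trainer == "L2":
        A = D.T @ D + alpha * np.eye(D.shape[1])   # regularized normal equations
        beta = np.linalg.solve(A, D.T @ y)
    else:
        raise ValueError(f"Unknown trainer: {trainer!r}")
    return {"W": W, "b": b, "beta": beta}

def rvfl_predict(params, X):
    """Apply the same hidden mapping, then the learned output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ params["W"] + params["b"])))
    return np.hstack([H, X]) @ params["beta"]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5])                 # a purely linear target

params_mpi = rvfl_fit(X, y, size_hidden=5, trainer="MPI", seed=42)
params_l2 = rvfl_fit(X, y, size_hidden=5, trainer="L2", alpha=0.5, seed=42)
y_hat = rvfl_predict(params_mpi, X)
print(np.abs(y_hat - y).max())
```

Because the direct link passes the raw inputs to the output layer, a linear target such as the one above can be fitted by the "MPI" variant regardless of what the random hidden layer produces.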

weights¶

Dictionary containing the initialized weights for hidden layers and output layers.

Type: dict

act_func¶

The activation function applied to the hidden layer.

Type: callable

size_input¶

Number of features in the input data.

Type: int

size_output¶

Number of outputs based on the target data dimensionality.

Type: int

loss_train¶

Stores the loss history during training, if applicable.

Type: list

CLS_OBJ_LOSSES = None¶

SUPPORTED_ACTIVATION = ['none', 'relu', 'leaky_relu', 'celu', 'prelu', 'gelu', 'elu', 'selu', 'rrelu', 'tanh', 'hard_tanh', 'sigmoid', 'hard_sigmoid', 'log_sigmoid', 'silu', 'swish', 'hard_swish', 'soft_plus', 'mish', 'soft_sign', 'tanh_shrink', 'soft_shrink', 'hard_shrink', 'softmin', 'softmax', 'log_softmax']¶
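
As a rough illustration of how a name from this list could map to a callable (compare the act_func attribute above), the following sketch keeps a small lookup table of NumPy functions. The table `ACTIVATIONS` and the helper `get_activation` are hypothetical names, only five of the listed activations are covered, and the library's own lookup may work differently.

```python
import numpy as np

# Hypothetical lookup table covering a handful of the listed names.
ACTIVATIONS = {
    "none": lambda z: z,
    "relu": lambda z: np.maximum(z, 0.0),
    "tanh": np.tanh,
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "silu": lambda z: z / (1.0 + np.exp(-z)),      # z * sigmoid(z)
}

def get_activation(name):
    """Return the callable registered under `name`, or fail with a clear error."""
    try:
        return ACTIVATIONS[name]
    except KeyError:
        raise ValueError(f"Unsupported activation: {name!r}") from None

z = np.array([-1.0, 0.0, 2.0])
relu_out = get_activation("relu")(z)
sigmoid_out = get_activation("sigmoid")(z)
print(relu_out, sigmoid_out)
```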

SUPPORTED_CLS_METRICS = {'AS': 'max', 'BSL': 'min', 'CEL': 'min', 'CKS': 'max', 'F1S': 'max', 'F2S': 'max', 'FBS': 'max', 'GINI': 'min', 'GMS': 'max', 'HL': 'min', 'HS': 'max', 'JSI': 'max', 'KLDL': 'min', 'LS': 'max', 'MCC': 'max', 'NPV': 'max', 'PS': 'max', 'ROC-AUC': 'max', 'RS': 'max', 'SS': 'max'}¶

SUPPORTED_REG_METRICS = {'A10': 'max', 'A20': 'max', 'A30': 'max', 'ACOD': 'max', 'APCC': 'max', 'AR': 'max', 'AR2': 'max', 'CI': 'max', 'COD': 'max', 'COR': 'max', 'COV': 'max', 'CRM': 'min', 'DRV': 'min', 'EC': 'max', 'EVS': 'max', 'GINI': 'min', 'GINI_WIKI': 'min', 'JSD': 'min', 'KGE': 'max', 'MAAPE': 'min', 'MAE': 'min', 'MAPE': 'min', 'MASE': 'min', 'ME': 'min', 'MRB': 'min', 'MRE': 'min', 'MSE': 'min', 'MSLE': 'min', 'MedAE': 'min', 'NNSE': 'max', 'NRMSE': 'min', 'NSE': 'max', 'OI': 'max', 'PCC': 'max', 'PCD': 'max', 'R': 'max', 'R2': 'max', 'R2S': 'max', 'RAE': 'min', 'RMSE': 'min', 'RSE': 'min', 'RSQ': 'max', 'SMAPE': 'min', 'VAF': 'max', 'WI': 'max'}¶
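
The 'max' and 'min' values in these dictionaries record whether a larger or a smaller score is better. A small sketch of how such a direction map can be used to pick the best candidate follows; `REG_METRICS` is a hand-copied subset of the dictionary above and `pick_best` is a hypothetical helper, not part of the package.

```python
# Hand-copied subset of the direction map shown above.
REG_METRICS = {"MSE": "min", "MAE": "min", "R2": "max", "KGE": "max"}

def pick_best(scores, metric):
    """Return the candidate whose score is best for the given metric."""
    direction = REG_METRICS[metric]
    chooser = max if direction == "max" else min
    return chooser(scores, key=scores.get)

best_by_mse = pick_best({"model_a": 0.12, "model_b": 0.08}, "MSE")
best_by_r2 = pick_best({"model_a": 0.91, "model_b": 0.87}, "R2")
print(best_by_mse, best_by_r2)
```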

SUPPORTED_WEIGHT_INITIALIZER = ['orthogonal', 'he_uniform', 'he_normal', 'glorot_uniform', 'glorot_normal', 'lecun_uniform', 'lecun_normal', 'random_uniform', 'random_normal']¶
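
For orientation, three of these initializers can be sketched in NumPy using the definitions from the Keras documentation linked above. The helper `init_weights` is hypothetical, and "he_normal" is drawn here from a plain normal distribution, whereas Keras uses a truncated one; the package's own implementation may differ in these details.

```python
import numpy as np

def init_weights(name, fan_in, fan_out, seed=None):
    """Draw a (fan_in, fan_out) weight matrix with the named scheme."""
    rng = np.random.default_rng(seed)
    shape = (fan_in, fan_out)
    if name == "glorot_uniform":
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return rng.uniform(-limit, limit, size=shape)
    if name == "lecun_uniform":
        limit = np.sqrt(3.0 / fan_in)
        return rng.uniform(-limit, limit, size=shape)
    if name == "he_normal":
        # Simplification: plain normal instead of a truncated normal.
        return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=shape)
    raise ValueError(f"Unsupported initializer in this sketch: {name!r}")

W_glorot = init_weights("glorot_uniform", fan_in=8, fan_out=10, seed=0)
W_lecun = init_weights("lecun_uniform", fan_in=8, fan_out=10, seed=0)
print(W_glorot.shape, np.abs(W_glorot).max())
```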

evaluate(y_true, y_pred, list_metrics=None)[source]¶

Default interface for the evaluate function.

fit(X, y)[source]¶

Fit the RVFL model to the training data.

Parameters:

X (ndarray of shape (n_samples, n_features)) – Training input features.
y (ndarray of shape (n_samples,) or (n_samples, n_outputs)) – Target values.

Returns:

self – The fitted model.

Return type:

BaseRVFL

get_weights()[source]¶

Retrieve the current weights of the RVFL model.

Returns: weights – Dictionary containing the current model weights.
Return type: dict

get_weights_size()[source]¶

Calculate the total number of parameters in the model.

Returns: size – Total number of parameters across all weights.
Return type: int
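
Counting parameters across a dictionary of weight arrays can be sketched as below. The dictionary keys and array shapes are invented for illustration; the actual keys returned by get_weights() are not documented here.

```python
import numpy as np

# Hypothetical weight dictionary: 4 inputs, 10 hidden nodes, 1 output.
weights = {
    "W_hidden": np.zeros((4, 10)),   # input-to-hidden weights
    "b_hidden": np.zeros(10),        # hidden biases
    "W_out": np.zeros((14, 1)),      # (hidden + direct input) to output
}

def count_parameters(weights):
    """Sum the number of elements over every array in the dictionary."""
    return int(sum(w.size for w in weights.values()))

total = count_parameters(weights)
print(total)  # 40 + 10 + 14 = 64
```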

static load_model(load_path='history', filename='network.pkl')[source]¶

Load a saved model from a pickle file.

Parameters:

load_path (str, default="history") – Directory containing the saved file.
filename (str, default="network.pkl") – Name of the file (must end with .pkl).

Returns:

model – Loaded model instance.

Return type:

BaseRVFL

predict(X)[source]¶

Predict target values using the fitted RVFL model.

Parameters: X (ndarray of shape (n_samples, n_features)) – Input data.
Returns: y_pred – Predicted target values.
Return type: ndarray

predict_proba(X)[source]¶

Predict probabilities (or scores) for classification tasks.

Parameters: X (ndarray of shape (n_samples, n_features)) – Input data.
Returns: y_pred – Predicted probabilities or scores.
Return type: ndarray

save_loss_train(save_path='history', filename='loss.csv')[source]¶

Save the training loss (convergence) history to a CSV file.

Parameters:

save_path (str, default="history") – Directory to save the file, relative to the path of the currently executing script.
filename (str, default="loss.csv") – Name of the file (must end with .csv).

save_metrics(y_true, y_pred, list_metrics=('RMSE', 'MAE'), save_path='history', filename='metrics.csv')[source]¶

Save evaluation metrics to a CSV file.

Parameters:

y_true (ndarray) – Ground truth target values.
y_pred (ndarray) – Predicted target values.
list_metrics (list of str, default=("RMSE", "MAE")) – List of metrics to calculate.
save_path (str, default="history") – Directory to save the file.
filename (str, default="metrics.csv") – Name of the file (must end with .csv).
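
A sketch of writing a metrics dictionary to a CSV file with the standard library is shown below. The helper `write_metrics_csv` and the one-row layout (metric names as the header, values as a single row) are assumptions; the file produced by save_metrics may be laid out differently.

```python
import csv
import tempfile
from pathlib import Path

def write_metrics_csv(results, save_path="history", filename="metrics.csv"):
    """Write a {metric_name: value} dictionary as a one-row CSV file."""
    if not filename.endswith(".csv"):
        raise ValueError("filename must end with .csv")
    folder = Path(save_path)
    folder.mkdir(parents=True, exist_ok=True)    # create the directory if needed
    target = folder / filename
    with open(target, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(results))
        writer.writeheader()
        writer.writerow(results)
    return target

# Write into a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    path = write_metrics_csv({"RMSE": 0.25, "MAE": 0.125}, save_path=tmp)
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
print(rows)
```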

save_model(save_path='history', filename='network.pkl')[source]¶

Save the network to a pickle file.

Parameters:

save_path (str, default="history") – Directory to save the file.
filename (str, default="network.pkl") – Name of the file (must end with .pkl).
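
The save and load pair can be pictured as a pickle round trip. The sketch below uses only the standard library; `save_object` and `load_object` are hypothetical stand-ins for save_model and load_model, and a plain dictionary stands in for a fitted network.

```python
import pickle
import tempfile
from pathlib import Path

def save_object(obj, save_path="history", filename="network.pkl"):
    """Pickle `obj` to save_path/filename, creating the directory if needed."""
    if not filename.endswith(".pkl"):
        raise ValueError("filename must end with .pkl")
    folder = Path(save_path)
    folder.mkdir(parents=True, exist_ok=True)
    with open(folder / filename, "wb") as f:
        pickle.dump(obj, f)

def load_object(load_path="history", filename="network.pkl"):
    """Load a previously pickled object from load_path/filename."""
    with open(Path(load_path) / filename, "rb") as f:
        return pickle.load(f)

# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    save_object({"size_hidden": 10, "trainer": "MPI"}, save_path=tmp)
    restored = load_object(load_path=tmp)
print(restored)
```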

save_y_predicted(X, y_true, save_path='history', filename='y_predicted.csv')[source]¶

Save the predicted results to a CSV file.

Parameters:

X (ndarray) – Input features.
y_true (ndarray) – Ground truth target values.
save_path (str, default="history") – Directory to save the file.
filename (str, default="y_predicted.csv") – Name of the file (must end with .csv).

score(X, y)[source]¶

Default interface for the score function.

scores(X, y, list_metrics=None)[source]¶

Default interface for the scores function.

set_weights(weights)[source]¶

Set the weights for the RVFL model.

Parameters: weights (dict) – Dictionary containing the weights to set.

graforvfl.network.gfo_rvfl_tuner module¶

class graforvfl.network.gfo_rvfl_tuner.GfoRvflTuner(problem_type='regression', bounds=None, cv=5, scoring='MSE', optimizer='OriginalWOA', optimizer_paras=None, verbose=True, seed=None)[source]¶

Bases: object

Defines the Gradient-Free Optimization (GFO)-based hyper-parameter tuner for the Random Vector Functional Link (RVFL) network.

Parameters:

problem_type (str, default="regression") – The problem type, either "regression" or "classification".
bounds (from Mealpy library, default=None) – The search boundaries for the RVFL hyper-parameters. Can be an instance, or a list of instances, of these classes: [FloatVar, BoolVar, StringVar, IntegerVar, PermutationVar, BinaryVar, MixedSetVar]
cv (int, default=5) – Number of folds for k-fold cross-validation.
scoring (str, default="MSE") – Name of the objective metric. Valid names depend on whether the problem is classification or regression (see SUPPORTED_CLS_METRICS and SUPPORTED_REG_METRICS).
optimizer (str or instance of Optimizer class (from Mealpy library), default="OriginalWOA") – The metaheuristic algorithm used to solve the hyper-parameter tuning problem. For the list of supported optimizers, see https://github.com/thieu1995/mealpy. If a custom optimizer is passed, make sure it is an instance of the Optimizer class.
optimizer_paras (None or dict, default=None) – Parameters for the optimizer object. If None, the optimizer's default parameters are used (defined in https://github.com/thieu1995/mealpy). If a dict is passed, make sure it contains at least the epoch and pop_size parameters.
verbose (bool, default=True) – Whether to print progress messages to stdout.
seed (int, default=None) – Determines random number generation for weights and bias initialization. Pass an int for reproducible results across multiple function calls.

Examples

>>> from sklearn.datasets import load_breast_cancer
>>> from mealpy import StringVar, IntegerVar
>>> from graforvfl import Data, GfoRvflTuner

>>> ## Load data object
>>> X, y = load_breast_cancer(return_X_y=True)
>>> data = Data(X, y)

>>> ## Split train and test
>>> data.split_train_test(test_size=0.2, random_state=2, inplace=True)
>>> print(data.X_train.shape, data.X_test.shape)

>>> ## Scaling dataset
>>> data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "minmax"))
>>> data.X_test = scaler_X.transform(data.X_test)

>>> data.y_train, scaler_y = data.encode_label(data.y_train)
>>> data.y_test = scaler_y.transform(data.y_test)

>>> # Design the boundary (parameters)
>>> my_bounds = [
>>>     IntegerVar(lb=2, ub=1000, name="size_hidden"),
>>>     StringVar(valid_sets=("none", "relu", "leaky_relu", "celu", "prelu", "gelu",
>>>         "elu", "selu", "rrelu", "tanh", "sigmoid"), name="act_name"),
>>>     StringVar(valid_sets=("orthogonal", "he_uniform", "he_normal", "glorot_uniform", "glorot_normal",
>>>         "lecun_uniform", "lecun_normal", "random_uniform", "random_normal"), name="weight_initializer")
>>> ]

>>> opt_paras = {"name": "WOA", "epoch": 10, "pop_size": 20}
>>> model = GfoRvflTuner(problem_type="classification", bounds=my_bounds, cv=3, scoring="AS",
>>>                   optimizer="OriginalWOA", optimizer_paras=opt_paras, verbose=True, seed=42)
>>> model.fit(data.X_train, data.y_train)
>>> print(model.best_params)
>>> print(model.best_estimator)
>>> print(model.best_estimator.scores(data.X_test, data.y_test, list_metrics=("PS", "RS", "NPV", "F1S", "F2S")))

SUPPORTED_CLS_METRICS = {'AS': 'max', 'BSL': 'min', 'CEL': 'min', 'CKS': 'max', 'F1S': 'max', 'F2S': 'max', 'FBS': 'max', 'GINI': 'min', 'GMS': 'max', 'HL': 'min', 'HS': 'max', 'JSI': 'max', 'KLDL': 'min', 'LS': 'max', 'MCC': 'max', 'NPV': 'max', 'PS': 'max', 'ROC-AUC': 'max', 'RS': 'max', 'SS': 'max'}¶

SUPPORTED_REG_METRICS = {'A10': 'max', 'A20': 'max', 'A30': 'max', 'ACOD': 'max', 'APCC': 'max', 'AR': 'max', 'AR2': 'max', 'CI': 'max', 'COD': 'max', 'COR': 'max', 'COV': 'max', 'CRM': 'min', 'DRV': 'min', 'EC': 'max', 'EVS': 'max', 'GINI': 'min', 'GINI_WIKI': 'min', 'JSD': 'min', 'KGE': 'max', 'MAAPE': 'min', 'MAE': 'min', 'MAPE': 'min', 'MASE': 'min', 'ME': 'min', 'MRB': 'min', 'MRE': 'min', 'MSE': 'min', 'MSLE': 'min', 'MedAE': 'min', 'NNSE': 'max', 'NRMSE': 'min', 'NSE': 'max', 'OI': 'max', 'PCC': 'max', 'PCD': 'max', 'R': 'max', 'R2': 'max', 'R2S': 'max', 'RAE': 'min', 'RMSE': 'min', 'RSE': 'min', 'RSQ': 'max', 'SMAPE': 'min', 'VAF': 'max', 'WI': 'max'}¶

fit(X, y)[source]¶

static load_model(load_path='history', filename='network.pkl')[source]¶

Load a saved model from a pickle file.

Parameters:

load_path (str, default="history") – Directory containing the saved file.
filename (str, default="network.pkl") – Name of the file (must end with .pkl).

Returns:

model – Loaded model instance.

Return type:

BaseRVFL

predict(X)[source]¶

save_convergence(save_path='history', filename='convergence.csv')[source]¶

Save the convergence (fitness value) history of the training process to a CSV file.

Parameters:

save_path (str, default="history") – Directory to save the file, relative to the path of the currently executing script.
filename (str, default="convergence.csv") – Name of the file (must end with .csv).

save_model(save_path='history', filename='network.pkl')[source]¶

Save the network to a pickle file.

Parameters:

save_path (str, default="history") – Directory to save the file, relative to the path of the currently executing script.
filename (str, default="network.pkl") – Name of the file (must end with .pkl).

save_performance_metrics(y_true, y_pred, list_metrics=('RMSE', 'MAE'), save_path='history', filename='metrics.csv')[source]¶

Save evaluation metrics to a CSV file.

Parameters:

y_true (ndarray) – Ground truth target values.
y_pred (ndarray) – Predicted target values.
list_metrics (list of str, default=("RMSE", "MAE")) – List of evaluation metrics to calculate.
save_path (str, default="history") – Directory to save the file, relative to the path of the currently executing script.
filename (str, default="metrics.csv") – Name of the file (must end with .csv).

save_y_predicted(X, y_true, save_path='history', filename='y_predicted.csv')[source]¶

Save the predicted results to a CSV file.

Parameters:

X (ndarray) – Input features.
y_true (ndarray) – Ground truth target values.
save_path (str, default="history") – Directory to save the file, relative to the path of the currently executing script.
filename (str, default="y_predicted.csv") – Name of the file (must end with .csv).

class graforvfl.network.gfo_rvfl_tuner.HyperparameterProblem(bounds=None, minmax='max', X=None, y=None, model_class=None, metric_class=None, obj_name=None, cv=5, seed=None, **kwargs)[source]¶

Bases: Problem

This class defines the hyper-parameter tuning problem that is passed to the Mealpy library.

Parameters:

bounds (from Mealpy library.) –
minmax (from Mealpy library.) –
X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
model_class (RvflRegressor or RvflClassifier) – The class definition of RVFL network for regression or classification problem.
metric_class (RegressionMetric or ClassificationMetric) – The class definition of Performance Metrics for regression or classification problem.
obj_name (str) – The name of the loss function used in network
cv (int, default=5) – Number of folds for k-fold cross-validation.
seed (int, default=None) – Determines random number generation for weights and bias initialization. Pass an int for reproducible results across multiple function calls.

obj_func(x)[source]¶

Objective function

Parameters: x (numpy.ndarray) – Solution.
Returns: Function value of x.
Return type: float
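
An objective function of this kind typically scores one candidate hyper-parameter setting by k-fold cross-validation and returns the averaged metric as the fitness. The sketch below illustrates that pattern under stated assumptions: a closed-form ridge regression stands in for the RVFL model, the single tuned hyper-parameter is its penalty `alpha`, and the metric is MSE. The names `ridge_fit_predict` and `cv_objective` are hypothetical, and decoding a Mealpy solution vector into named hyper-parameters is left out.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha):
    """Stand-in model: ridge regression solved in closed form."""
    A = X_train.T @ X_train + alpha * np.eye(X_train.shape[1])
    beta = np.linalg.solve(A, X_train.T @ y_train)
    return X_test @ beta

def cv_objective(alpha, X, y, cv=5, seed=None):
    """Mean k-fold MSE of the stand-in model for one candidate `alpha`."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), cv)
    fold_scores = []
    for k in range(cv):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(cv) if j != k])
        y_pred = ridge_fit_predict(X[train_idx], y[train_idx], X[test_idx], alpha)
        fold_scores.append(np.mean((y[test_idx] - y_pred) ** 2))
    return float(np.mean(fold_scores))

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=60)

# Evaluate a few candidates the way an optimizer would evaluate solutions.
candidates = [1e-3, 1.0, 1e3]
fitness = {alpha: cv_objective(alpha, X, y, cv=5, seed=42) for alpha in candidates}
best_alpha = min(fitness, key=fitness.get)   # MSE is a "min" metric
print(fitness, best_alpha)
```

Reusing the same seed for every candidate keeps the fold split identical, so differences in fitness reflect the hyper-parameter rather than the partition.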

graforvfl.network.standard_rvfl module¶

class graforvfl.network.standard_rvfl.RvflClassifier(size_hidden=10, act_name='sigmoid', weight_initializer='random_normal', trainer='MPI', alpha=0.5, seed=None)[source]¶

Bases: BaseRVFL, ClassifierMixin

Defines the RVFL network for classification problems. It inherits from the BaseRVFL and ClassifierMixin classes.

Parameters:

size_hidden (int, default=10) – The number of hidden nodes
act_name (str, default="sigmoid") – The activation function of the hidden layer. Supported values: ["none", "relu", "leaky_relu", "celu", "prelu", "gelu", "elu", "selu", "rrelu", "tanh", "hard_tanh", "sigmoid", "hard_sigmoid", "log_sigmoid", "silu", "swish", "hard_swish", "soft_plus", "mish", "soft_sign", "tanh_shrink", "soft_shrink", "hard_shrink", "softmin", "softmax", "log_softmax"]
weight_initializer (str, default="random_normal") – The weight initialization method. Supported methods: ["orthogonal", "he_uniform", "he_normal", "glorot_uniform", "glorot_normal", "lecun_uniform", "lecun_normal", "random_uniform", "random_normal"]. For definitions of these methods, see https://keras.io/api/layers/initializers/
trainer (str, default="MPI") –
The method used to train the hidden-to-output and input-to-output weights.
- MPI: Moore-Penrose inversion (ordinary least squares without regularization)
- L2: least squares regression with L2 regularization
alpha (float, optional, default=0.5) – The penalty value for the L2 method. Takes effect only when trainer="L2".
seed (int, default=None) – Determines random number generation for weights and bias initialization. Pass an int for reproducible results across multiple function calls.

Examples

>>> from graforvfl import Data, RvflClassifier
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=100, random_state=1)
>>> data = Data(X, y)
>>> data.split_train_test(test_size=0.2, random_state=1)
>>> model = RvflClassifier(size_hidden=10, act_name='sigmoid', weight_initializer="random_normal", trainer="MPI", alpha=0.5, seed=42)
>>> model.fit(data.X_train, data.y_train)
>>> pred = model.predict(data.X_test)
>>> print(pred)
array([1, 0, 1, 0, 1])

CLS_OBJ_LOSSES = ['CEL', 'HL', 'KLDL', 'BSL']¶

evaluate(y_true, y_pred, list_metrics=('AS', 'RS'))[source]¶

Return the list of classification performance metrics of the prediction.

Parameters:

y_true (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
y_pred (array-like of shape (n_samples,) or (n_samples, n_outputs)) – Predicted values for X.
list_metrics (list) – You can get classification metrics from Permetrics library: https://permetrics.readthedocs.io/en/latest/pages/classification.html

Returns:

results – The results of the list metrics

Return type:

dict

fit(X, y)[source]¶

Fit the RVFL model to the training data.

Parameters:

X (ndarray of shape (n_samples, n_features)) – Training input features.
y (ndarray of shape (n_samples,) or (n_samples, n_outputs)) – Target values.

Returns:

self – The fitted model.

Return type:

BaseRVFL

predict(X)[source]¶

Predict target values using the fitted RVFL model.

Parameters: X (ndarray of shape (n_samples, n_features)) – Input data.
Returns: y_pred – Predicted target values.
Return type: ndarray

predict_proba(X)[source]¶

Predict probabilities (or scores) for classification tasks.

Parameters: X (ndarray of shape (n_samples, n_features)) – Input data.
Returns: y_pred – Predicted probabilities or scores.
Return type: ndarray
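
One common way to relate raw per-class scores, probabilities, and predicted labels is a softmax followed by an argmax, sketched below. Whether predict_proba applies a softmax internally is not stated here, so this is an illustration of the general idea only; the `softmax` helper and the score values are made up.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Two samples, three classes (made-up raw scores).
scores = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
proba = softmax(scores)
labels = proba.argmax(axis=1)    # the most probable class per sample
print(proba.sum(axis=1), labels)
```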

score(X, y)[source]¶

Return the accuracy score (AS) metric.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.

Returns:

result – The result of selected metric

Return type:

float

scores(X, y, list_metrics=('AS', 'RS'))[source]¶

Return the list of classification metrics of the prediction.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
list_metrics (list, default=("AS", "RS")) – You can get classification metrics from Permetrics library: https://permetrics.readthedocs.io/en/latest/pages/classification.html

Returns:

results – The results of the list metrics

Return type:

dict

class graforvfl.network.standard_rvfl.RvflRegressor(size_hidden=10, act_name='sigmoid', weight_initializer='random_normal', trainer='MPI', alpha=0.5, seed=None)[source]¶

Bases: BaseRVFL, RegressorMixin

Defines the RVFL network for regression problems. It inherits from the BaseRVFL and RegressorMixin classes.

Parameters:

size_hidden (int, default=10) – The number of hidden nodes
act_name (str, default="sigmoid") – The activation function of the hidden layer. Supported values: ["none", "relu", "leaky_relu", "celu", "prelu", "gelu", "elu", "selu", "rrelu", "tanh", "hard_tanh", "sigmoid", "hard_sigmoid", "log_sigmoid", "silu", "swish", "hard_swish", "soft_plus", "mish", "soft_sign", "tanh_shrink", "soft_shrink", "hard_shrink", "softmin", "softmax", "log_softmax"]
weight_initializer (str, default="random_normal") – The weight initialization method. Supported methods: ["orthogonal", "he_uniform", "he_normal", "glorot_uniform", "glorot_normal", "lecun_uniform", "lecun_normal", "random_uniform", "random_normal"]. For definitions of these methods, see https://keras.io/api/layers/initializers/
trainer (str, default="MPI") –
The method used to train the hidden-to-output and input-to-output weights.
- MPI: Moore-Penrose inversion (ordinary least squares without regularization)
- L2: least squares regression with L2 regularization
alpha (float, optional, default=0.5) – The penalty value for the L2 method. Takes effect only when trainer="L2".
seed (int, default=None) – Determines random number generation for weights and bias initialization. Pass an int for reproducible results across multiple function calls.

Examples

>>> from graforvfl import RvflRegressor, Data
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_samples=200, random_state=1)
>>> data = Data(X, y)
>>> data.split_train_test(test_size=0.2, random_state=1)
>>> model = RvflRegressor(size_hidden=10, act_name='sigmoid', weight_initializer="random_normal", trainer="MPI", alpha=0.5, seed=42)
>>> model.fit(data.X_train, data.y_train)
>>> pred = model.predict(data.X_test)
>>> print(pred)

evaluate(y_true, y_pred, list_metrics=('MSE', 'MAE'))[source]¶

Return the list of performance metrics of the prediction.

Parameters:

y_true (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
y_pred (array-like of shape (n_samples,) or (n_samples, n_outputs)) – Predicted values for X.
list_metrics (list) – You can get metrics from Permetrics library: https://github.com/thieu1995/permetrics

Returns:

results – The results of the list metrics

Return type:

dict

score(X, y)[source]¶

Return the R2 (coefficient of determination) metric. This is not the squared Pearson correlation coefficient.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.

Returns:

result – The result of selected metric

Return type:

float
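
The difference between the coefficient of determination and the squared Pearson correlation can be seen on a small example. The two helper functions below are illustrative and are not the Permetrics implementations. For a prediction that is perfectly correlated with the target but biased, the squared correlation is 1 while the coefficient of determination is strongly negative.

```python
import numpy as np

def r2_cod(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def pearson_squared(y_true, y_pred):
    """Square of the Pearson correlation coefficient."""
    return np.corrcoef(y_true, y_pred)[0, 1] ** 2

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = 2.0 * y_true + 1.0      # perfectly correlated, but scaled and shifted

cod = r2_cod(y_true, y_pred)     # SS_res = 54, SS_tot = 5, so 1 - 54/5 = -9.8
r2s = pearson_squared(y_true, y_pred)
print(cod, r2s)
```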

scores(X, y, list_metrics=('MSE', 'MAE'))[source]¶

Return the list of regression metrics of the prediction.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
list_metrics (list, default=("MSE", "MAE")) – You can get regression metrics from Permetrics library: https://permetrics.readthedocs.io/en/latest/pages/regression.html

Returns:

results – The results of the list metrics

Return type:

dict