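I have a dataset and used pd.get_dummies to one-hot encode the target column (the whole column contains 5 distinct strings). I then used sklearn's train_test_split function to create training, test, and validation sets, and normalized the training-set features with StandardScaler(). Finally, I fit a logistic regression model on the training features and targets.
I am now trying to compute accuracy scores for the training, validation, and test sets, but I am having no luck. The code for this part is as follows: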
    dataset = pd.read_csv('tabular_data/clean_tabular_data.csv')
    features, label = load_airbnb(dataset, 'Category')
    label_series = dataset['Category']
    label_encoded = pd.get_dummies(label_series)
    print(label_encoded.shape)
    print(label_encoded)

    X_train, X_test, y_train, y_test = train_test_split(features, label_encoded, test_size=0.3)
    X_test, X_validation, y_test, y_validation = train_test_split(X_test, y_test, test_size=0.5)

    # normalize the features
    scaler = StandardScaler()
    scaler.fit(X_train)
    X_train_scaled = scaler.transform(X_train)
    X_validation_scaled = scaler.transform(X_validation)
    X_test_scaled = scaler.transform(X_test)

    # get baseline classification model
    model = LogisticRegression()
    print(y_train)
    print(X_train_scaled.shape)
    y_train = y_train.iloc[:, 0]
    print(y_train.shape)
    model.fit(X_train_scaled, y_train)

    y_train_pred = model.predict(X_train_scaled)
    y_train_pred = np.argmax(y_train_pred, axis=0)
    y_validation_pred = model.predict(X_validation_scaled)
    y_validation_pred = np.argmax(y_validation_pred, axis=0)
    y_test_pred = model.predict(X_test_scaled)
    y_test_pred = np.argmax(y_test_pred, axis=0)

    # evaluate model using accuracy
    train_acc = accuracy_score(y_train, y_train_pred)
    test_acc = accuracy_score(y_test, y_test_pred)
    validation_acc = accuracy_score(y_validation, y_validation_pred)
The error I get is:

    File "C:\Users\lcox1\Documents\VSCode\AiCore\Data science\classification_prac.py", line 56, in <module>
        train_acc = accuracy_score(y_train, y_train_pred)
    TypeError: Singleton array 16 cannot be considered a valid collection.
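For what it's worth, the `16` in the message looks like a single number rather than an array of predictions. Below is a minimal sketch of how that might come about, assuming `model.predict` returns a 1-D array of class labels when the model is fit on a single target column. The array values are made up for illustration and are not from my data.

```python
import numpy as np

# Hypothetical stand-in for the output of model.predict(...):
# a 1-D array with one predicted label per sample (made-up values).
y_pred = np.array([0, 1, 0, 0, 1, 1])

# Taking argmax along axis 0 of a 1-D array collapses it to a single index:
# the position of the first maximum value, not one label per sample.
collapsed = np.argmax(y_pred, axis=0)

print(y_pred.shape)        # shape of the raw predictions
print(np.ndim(collapsed))  # number of dimensions after argmax
print(int(collapsed))      # the single index that is left
```

If this is what is happening, the value passed to `accuracy_score` would be a 0-dimensional scalar instead of a per-sample array, which seems consistent with the "Singleton array" wording.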
I am fairly new to Python, so I don't know what the problem is. Any help is much appreciated.