Six interview rounds at Ant Financial, bluffed the interviewer and landed 36K: why do interviews suddenly feel easy?

2024-05-04 07:27:04 Source: this site

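First, we need to convert the data into a format that can be used to train a model; the Pandas library is the usual tool for this kind of data handling. The code below builds a Pandas DataFrame from the data: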

```python
import pandas as pd

data = {
    "department": ["sales", "sales", "sales", "systems", "systems", "systems", "marketing", "marketing", "secretary", "secretary"],
    "status": ["senior", "junior", "junior", "junior", "junior", "senior", "senior", "junior", "senior", "junior"],
    "age": ["31...35", "26...30", "31...35", "21...35", "31...35", "41...45", "36...40", "31...35", "46...50", "26...30"],
    "salary": ["46K...50K", "26K...30K", "31K...35K", "46K...50K", "66K...70K", "46K...50K", "46K...50K", "41K...45K", "36K...40K", "26K...30K"],
    "count": [30, 40, 40, 20, 5, 3, 10, 4, 4, 6]
}

df = pd.DataFrame(data)
```
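The article does not say what the `count` column represents. Assuming it is the number of raw records that share each combination of attribute values (an assumption, not something the source states), a short sketch like the following totals those records overall and per `status`. The variable names `total_records` and `records_per_status` are illustrative only.

```python
import pandas as pd

# Only the two columns needed here, copied from the article's table.
df = pd.DataFrame({
    "status": ["senior", "junior", "junior", "junior", "junior",
               "senior", "senior", "junior", "senior", "junior"],
    "count": [30, 40, 40, 20, 5, 3, 10, 4, 4, 6],
})

# Assumption: each row stands for `count` identical raw records.
total_records = int(df["count"].sum())

# Sum of `count` within each status value.
records_per_status = df.groupby("status")["count"].sum()

print(total_records)
print(records_per_status)
```

Under that reading, the ten rows summarize 162 records, and the table is far from balanced between the two `status` values.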

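Next, we need to convert the non-numeric features to numbers, which can be done with the LabelEncoder class from sklearn. (scikit-learn documents LabelEncoder as an encoder for target labels; it is applied one column at a time here.) The code below converts every string column to integer codes: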

```python
from sklearn.preprocessing import LabelEncoder

# A single encoder is refit on each column in turn, so after the last call
# `le` only remembers the mapping for 'salary'.
le = LabelEncoder()
df['department'] = le.fit_transform(df['department'])
df['status'] = le.fit_transform(df['status'])
df['age'] = le.fit_transform(df['age'])
df['salary'] = le.fit_transform(df['salary'])
```
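Because one encoder is refit on every column, the mappings for the earlier columns are no longer available once the snippet finishes. A possible variation, sketched here on a four-row excerpt of the table and using a hypothetical `encoders` dictionary that is not part of the original article, keeps one fitted encoder per column so that codes can be decoded again:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Four rows taken from the article's table, two columns only.
df = pd.DataFrame({
    "department": ["sales", "systems", "marketing", "secretary"],
    "status": ["senior", "junior", "senior", "junior"],
})

# One fitted encoder per column (hypothetical helper structure).
encoders = {}
encoded = df.copy()
for col in ["department", "status"]:
    enc = LabelEncoder()
    encoded[col] = enc.fit_transform(df[col])
    encoders[col] = enc

# classes_ is sorted; a value's integer code is its position in this array.
department_classes = list(encoders["department"].classes_)

# Decode the integer codes back to the original strings.
restored = encoders["department"].inverse_transform(encoded["department"])

print(department_classes)
print(list(restored))
```

Keeping the encoders around matters mostly when predictions or feature values need to be reported in their original string form.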

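Now we can split the data into a training set and a test set and train a decision tree model with the DecisionTreeClassifier class from sklearn. The code uses `count` as the target and the four encoded columns as features. Here is the complete code: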

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

# Build the DataFrame
data = {
    "department": ["sales", "sales", "sales", "systems", "systems", "systems", "marketing", "marketing", "secretary", "secretary"],
    "status": ["senior", "junior", "junior", "junior", "junior", "senior", "senior", "junior", "senior", "junior"],
    "age": ["31...35", "26...30", "31...35", "21...35", "31...35", "41...45", "36...40", "31...35", "46...50", "26...30"],
    "salary": ["46K...50K", "26K...30K", "31K...35K", "46K...50K", "66K...70K", "46K...50K", "46K...50K", "41K...45K", "36K...40K", "26K...30K"],
    "count": [30, 40, 40, 20, 5, 3, 10, 4, 4, 6]
}
df = pd.DataFrame(data)

# Convert the non-numeric features to integer codes
le = LabelEncoder()
df['department'] = le.fit_transform(df['department'])
df['status'] = le.fit_transform(df['status'])
df['age'] = le.fit_transform(df['age'])
df['salary'] = le.fit_transform(df['salary'])

# Split into training and test sets (10 rows, so the test set has 2 rows)
X = df.drop(['count'], axis=1)
y = df['count']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the decision tree; random_state makes tie-breaking repeatable
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Predict on the test set and compute per-class recall.
# zero_division=0 reports 0 for a class that is predicted but absent from y_test.
y_pred = clf.predict(X_test)
recall = recall_score(y_test, y_pred, average=None, zero_division=0)
print("Recall for each class:", recall)
```

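The article reports the following output:

```
Recall for each class: [0.66666667 1. 0. ]
```

That is, the model's recall for the three classes is 0.67, 1.0, and 0.0 respectively. These figures are best read as illustrative rather than reproducible: with 10 rows and test_size=0.2 the test set holds only two rows, so every per-class recall from this exact split can only be 0, 0.5, or 1.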
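Per-class recall is the fraction of a class's true members that the model labels correctly. A minimal sketch, using hypothetical label arrays that are not taken from the article's data and a hypothetical helper `per_class_recall`, computes it by hand and compares the result with `recall_score`:

```python
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical true and predicted labels, chosen only to illustrate the metric.
y_true = np.array([4, 4, 40, 40, 40, 30])
y_pred = np.array([4, 40, 40, 40, 4, 4])

def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted members / true members."""
    labels = np.unique(np.concatenate([y_true, y_pred]))
    result = {}
    for label in labels:
        members = y_true == label            # rows whose true class is `label`
        n_members = int(members.sum())
        hits = int((y_pred[members] == label).sum())
        # A class with no true members gets 0.0 here by convention.
        result[int(label)] = hits / n_members if n_members else 0.0
    return result

manual = per_class_recall(y_true, y_pred)

# average=None returns one value per class, in sorted label order.
library = recall_score(y_true, y_pred, average=None)

print(manual)
print(library)
```

A recall of 2/3 needs a class with at least three true members in the evaluated set, which is why the size of the test split limits the values that can appear.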
