59️⃣59️⃣-deeplearning-project

🦁 Likelion AI SCHOOL7 👶🤱 으쌰으쌰 Team 3, Group 9: Team 오9️⃣오9️⃣ (O9O9)

πŸ‘¨β€πŸ‘©β€πŸ‘§β€πŸ‘¦ Team Info.

| name | github | velog & blog |
| --- | --- | --- |
| 김예지👑 | | |
| 이정은 | | |
| 조예슬 | | |
| 임종우 | | |
| 권태윤 | | |

πŸ’‘ Project Info.

| number | title | link | best score |
| --- | --- | --- | --- |
| 1 | 🥬 Prediction of bok choy growth | https://dacon.io/competitions/official/235961/overview/description | private MSE 17.53 |
| 2 | 😷 Face Mask Classification | https://www.kaggle.com/datasets/dhruvmak/face-mask-detection | accuracy 0.97 |
| 3 | 📝 Sentences type Classification | https://dacon.io/competitions/official/236037/overview/description | private accuracy 0.7559 |

1. πŸ₯¬ Prediction of bok choy growth πŸ₯¬


πŸ† dacon AI κ²½μ§„λŒ€νšŒ : 청경채 μ„±μž₯λ₯  μ˜ˆμΈ‘ν•˜κΈ°
μ£Όμ†Œ : https://dacon.io/competitions/official/235961/overview/description

πŸ“œ notion : https://www.notion.so/MINI5-AI-b031d68247e24a30b192b24c522284d1

πŸ“ƒ summary

4μ°¨ μ‚°μ—…ν˜λͺ… μ‹œλŒ€λ₯Ό λ§žμ•„ 농업 λΆ„μ•Όμ—μ„œλ„ AI 기술이 널리 μ‚¬μš©λ˜μ–΄ IT κΈ°μˆ μ„ λ™μ›ν•œ 슀마트팜 λ“± λ”μš± 효율적인 μž‘λ¬Ό μž¬λ°°κ°€ κ°€λŠ₯해지고 μžˆμŠ΅λ‹ˆλ‹€. μž‘λ¬Όμ˜ 효율적인 μƒμœ‘μ„ μœ„ν•œ 졜적의 ν™˜κ²½μ„ λ„μΆœν•œλ‹€λ©΄ 식물 μž¬λ°°μ— 큰 도움이 될 것이며, 청경채 뿐만 μ•„λ‹Œ λͺ¨λ“  μž‘λ¬Ό 재배율이 μ’‹μ•„μ§ˆ κ²ƒμž…λ‹ˆλ‹€. 미래의 μž‘λ¬Ό μž¬λ°°μ—μ„œλŠ” 이 데이터λ₯Ό 가지고 인곡지λŠ₯을 μ΄μš©ν•˜μ—¬ μž‘λ¬Όλ³„ λ§žμΆ€ν˜• μ†”λ£¨μ…˜μ„ 농업인듀이 νŽΈλ¦¬ν•˜κ³  μΉœκ·Όν•˜κ²Œ μƒν™œ μ†μ—μ„œ ν™œμš©ν•˜λŠ” 첫 κ±ΈμŒμ„ 내딛을 수 μžˆμ„ κ²ƒμž…λ‹ˆλ‹€.

λ”°λΌμ„œ, 인곡지λŠ₯(AI)을 ν™œμš©ν•˜μ—¬ κ΅­λ‚΄ 고유 식물 μžμ›μ—μ„œ μœ μš©ν•œ μ²œμ—°λ¬Ό μ†Œμž¬λ₯Ό νƒμƒ‰ν•˜κ³ , κ·Έ 효λŠ₯κ³Ό ν™œμ„± 등에 λŒ€ν•΄ μ—°κ΅¬ν•˜λŠ” 것이 λͺ©ν‘œμž…λ‹ˆλ‹€.
μ‹€μ œ AIλ₯Ό μ΄μš©ν•œ μž‘λ¬Όμ„ μž¬λ°°ν•˜λŠ” 슀마트팜과 같은 곳에 μœ μš©ν•˜κ²Œ μ‚¬μš©λ  κ²ƒμž…λ‹ˆλ‹€.

πŸ—‚ Data info.

Dacon bok choy growth prediction data : https://dacon.io/competitions/official/235961/data

πŸ“ train input dataset[folder]
image
총 58개 청경채 μΌ€μ΄μŠ€λ₯Ό 각 청경채 μΌ€μ΄μŠ€ 별 ν™˜κ²½ 데이터(1λΆ„ 간격)으둜 κ΅¬μ„±λ˜μ–΄ 있음

πŸ“ train target dataset[folder]

총 58개 청경채 μΌ€μ΄μŠ€λ₯Ό rate column의 각 청경채 μΌ€μ΄μŠ€ 별 잎 면적 증감λ₯ (1일 간격)둜 κ΅¬μ„±λ˜μ–΄ 있음

πŸ“‚ train(input+target) shape
train(input+target) (1813, 43)
test(input+target) (195, 43)
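
As a rough illustration, the flat (case, day) table above can be built by aggregating each case's 1-minute environment log to daily statistics and joining it with the daily target rate. The sketch below assumes hypothetical file and column names (CASE_*.csv, obs_time, rate), not the exact competition schema.

import glob
import pandas as pd

rows = []
for input_path in sorted(glob.glob("train_input/CASE_*.csv")):
    case_id = input_path.split("CASE_")[-1].split(".")[0]
    env = pd.read_csv(input_path)
    env["date"] = pd.to_datetime(env["obs_time"]).dt.date                  # assumed timestamp column
    daily_env = env.groupby("date").mean(numeric_only=True).reset_index()  # 1-minute rows -> daily features

    target = pd.read_csv(f"train_target/CASE_{case_id}.csv")               # daily leaf-area rate
    target["date"] = pd.to_datetime(target["obs_time"]).dt.date

    merged = daily_env.merge(target[["date", "rate"]], on="date", how="inner")
    merged["case"] = case_id
    rows.append(merged)

train = pd.concat(rows, ignore_index=True)   # roughly the (1813, 43) table above
print(train.shape)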

πŸ“Š Visualization

1️⃣ λ‚΄λΆ€μ˜¨λ„κ΄€μΈ‘μΉ˜, λ‚΄λΆ€μŠ΅λ„κ΄€μΈ‘μΉ˜, μ΄μΆ”μ •κ΄‘λŸ‰, 월별 rate
image

2️⃣ rate by estimated light amount: red, blue, white, and total
Growth rate is highest near 100 for white and total light, and near 0 for red and blue.

3️⃣ Cooling status by EC and CO2
The larger the observed EC, the less the cooling was on; conversely, the smaller the EC, the more the cooling was on.

4️⃣ Change in leaf-area growth rate (rate) for each case
The distributions suggested that a scaling step was needed to make them consistent.

πŸ” Modeling

πŸ“Œ Scaling - RobustScaler

from sklearn.preprocessing import RobustScaler        
rb = RobustScaler()         
train_X = rb.fit_transform(train_X)         
test_X = rb.transform(test_X)            

πŸ“Œ Tensorflow

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=128, input_shape=[input_shape]),  # input_shape: number of input features
    tf.keras.layers.Dense(128, activation='selu'),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(1)
])

optimizer = tf.keras.optimizers.RMSprop(0.001)

model.compile(optimizer=optimizer,
              loss="mse",                 # single regression output, so one loss; MAE is tracked as a metric
              metrics=["mae", "mse"])

πŸ“Œ Pytorch

import torch
from torch import optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")   # assumed device setup

linear1 = torch.nn.Linear(train_X.shape[1], 512, bias=True)
linear3 = torch.nn.Linear(512, 256, bias=True)
linear4 = torch.nn.Linear(256, 128, bias=True)
linear5 = torch.nn.Linear(128, 64, bias=True)
linear6 = torch.nn.Linear(64, 32, bias=True)
linear7 = torch.nn.Linear(32, 10, bias=True)
linear8 = torch.nn.Linear(10, 1, bias=True)

relu = torch.nn.ReLU()
dropout = torch.nn.Dropout(p=0.1)

model = torch.nn.Sequential(linear1, relu,
                            linear3, relu,
                            linear4, relu,
                            linear5, relu,
                            linear6, relu,
                            linear7, relu,
                            linear8).to(device)

# Define the loss function and optimizer with the nn package.
loss_fn = torch.nn.MSELoss().to(device)
optimizer = optim.Adam(model.parameters(), lr=0.0005)

# Weight initialization (the model has no linear2 layer, so only the layers above are initialized)
torch.nn.init.xavier_normal_(linear1.weight)
torch.nn.init.xavier_normal_(linear3.weight)
torch.nn.init.xavier_normal_(linear4.weight)
torch.nn.init.xavier_normal_(linear5.weight)
torch.nn.init.xavier_normal_(linear6.weight)
torch.nn.init.xavier_normal_(linear7.weight)
torch.nn.init.xavier_normal_(linear8.weight)

# Output of the last call (linear8.weight has shape 1 x 10):
Parameter containing:
tensor([[-0.1239,  0.3789, -0.2748,  0.2951,  0.0612,  0.2898,  0.2660, -0.4028,
         -1.1738,  0.6484]], requires_grad=True)
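
The training loop itself is not shown in this snippet; one plausible version, assuming train_X / train_y are NumPy arrays and using an illustrative batch size and epoch count, is sketched below.

from torch.utils.data import DataLoader, TensorDataset

loader = DataLoader(TensorDataset(torch.as_tensor(train_X, dtype=torch.float32),
                                  torch.as_tensor(train_y, dtype=torch.float32).reshape(-1, 1)),
                    batch_size=64, shuffle=True)

model.train()
for epoch in range(200):                         # illustrative epoch count
    epoch_loss = 0.0
    for X_batch, y_batch in loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(X_batch), y_batch)  # MSE on the predicted daily rate
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    if epoch % 20 == 0:
        print(epoch, epoch_loss / len(loader))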

πŸ“Œ LSTM

import torch.nn as nn

class BaseModel(nn.Module):
    def __init__(self):
        super(BaseModel, self).__init__()
        # LSTM layer over the input features
        self.lstm = nn.LSTM(input_size=train_X.shape[1], hidden_size=256, batch_first=True, bidirectional=False)
        # additional fully connected layers, with ReLU applied between them
        self.dense = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 10), nn.ReLU(),
            nn.Linear(10, 1)
        )
        self.dropout = nn.Dropout(0.2)

    def forward(self, x):
        hidden, _ = self.lstm(x)
        x = self.dropout(hidden)
        output = self.dense(x)
        return output

model = BaseModel().to(device)
loss_fn = torch.nn.MSELoss().to(device)
optimizer = optim.Adam(model.parameters(), lr=0.0005)
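
Note that nn.LSTM with batch_first=True expects input of shape (batch, seq_len, features); if the scaled table is a flat (batch, features) array, it needs a sequence dimension first. A minimal shape check, treating each row as a length-1 sequence (one possible reading of this setup):

model.eval()                                                     # switch off dropout for this check
x = torch.as_tensor(train_X, dtype=torch.float32).unsqueeze(1)   # (batch, 1, n_features)
with torch.no_grad():
    out = model(x.to(device))                                    # (batch, 1, 1) per-timestep outputs
print(out.squeeze(-1).squeeze(-1).shape)                         # (batch,) predicted rates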

πŸ€ Submission & Score (mse)

πŸ“ŒTensorflow
public : 17.91, private : 17.53
πŸ“Œ Pytorch
public : 19.2138, private : 18.009
πŸ“ŒLSTM
public : 21.5578 / private 22.5213


2. 😷 Face Mask Classification 😷

πŸ† Kaggle AI κ²½μ§„λŒ€νšŒ : 마슀크 착용/미착용 λΆ„λ₯˜
μ£Όμ†Œ : https://www.kaggle.com/datasets/dhruvmak/face-mask-detection

πŸ“œ notion : https://www.notion.so/MINI6-Mask_or_No_Mask-Classification-a7d66cebd161444180e9024e13be2f98#35d0f4877a1f4aa3bd9a0719fc5bea2d
πŸ“Œ streamlit : https://seul1230-mask-classification-main-pqg0f0.streamlit.app/ πŸ“Œ streamlit : https://imngooh-mini-streamlit-app-82wtn0.streamlit.app/

πŸ“ƒ summary

μ½”λ‘œλ‚˜19 λ°”μ΄λŸ¬μŠ€λ‘œ μΈν•œ 마슀크 착용 μ˜λ¬΄ν™”ν•˜μ˜€μ—ˆκ³ , 그에 λ”°λ₯Έ 마슀크 λ―Έμ°©μš©μžμ— λŒ€ν•œ κ³Όνƒœλ‘œ λΆ€κ³Ό λŒ€μƒμ— μ²˜ν–ˆμ—ˆλ‹€. 마슀크 착용과 미착용의 λΆ„λ₯˜λ₯Ό 톡해 λͺ¨λ‹ˆν„°λ§ν•˜λŠ” 인λ ₯을 κ°μ†Œν™”ν•˜κ³  마슀크 착용의 μ˜λ¬΄ν™”λ₯Ό 느끼고 착용λ₯ μ„ λ†’μ΄κ³ μž ν•œλ‹€.

πŸ—‚ Data info.

kaggle 마슀크 착용 μ—¬λΆ€ 이미지 데이터 : https://www.kaggle.com/datasets/dhruvmak/face-mask-detection

πŸ“ with mask[folder]

총 220개의 마슀크 μ°©μš©ν•œ μ‚¬λžŒλ“€μ˜ 이미지

πŸ“ without mask[folder]

총 220개의 마슀크 λ―Έμ°©μš©ν•œ μ‚¬λžŒλ“€μ˜ 이미지

πŸ“‚ train/valid/test shape
train_df (281, 2)
val_df (71, 2)
test_df (88, 2)
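
The three dataframes above (file path + label) can be built from the two folders roughly as follows; the directory names and the nested 80/20 splits are assumptions that happen to reproduce the 281/71/88 row counts.

import glob
import pandas as pd
from sklearn.model_selection import train_test_split

files = glob.glob("face-mask-detection/with_mask/*") + \
        glob.glob("face-mask-detection/without_mask/*")     # assumed folder layout
df = pd.DataFrame({"filepath": files})
df["label"] = ["without_mask" if "without_mask" in f else "with_mask" for f in files]

# 440 images -> 88 test, then 71 validation and 281 train
train_val_df, test_df = train_test_split(df, test_size=0.2, stratify=df["label"], random_state=42)
train_df, val_df = train_test_split(train_val_df, test_size=0.2,
                                    stratify=train_val_df["label"], random_state=42)
print(train_df.shape, val_df.shape, test_df.shape)           # (281, 2) (71, 2) (88, 2)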

πŸ“Š Visualization

βœ… Target Ratio

πŸ” Modeling

⭐ Modeling with Tensorflow ⭐
📌 ResNet152V2

from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet152V2
from tensorflow.keras.callbacks import EarlyStopping

# apply weights pre-trained on ImageNet and freeze the backbone
md = ResNet152V2(include_top=False, pooling='max', 
                 weights='imagenet', input_shape=(height, width, 3))
md.trainable = False

model = models.Sequential()
model.add(md)
model.add(layers.Dense(1, activation = 'sigmoid'))

model.compile(loss='binary_crossentropy', 
              optimizer='adam', 
              metrics=['accuracy'])
              
early_stop = EarlyStopping(patience=5)

history = model.fit(train_datagen, epochs=20, 
                    validation_data=val_datagen,
                    validation_steps=len(val_datagen),
                    callbacks = [early_stop])         
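
The train_datagen / val_datagen iterators passed to fit are not shown in this snippet; a typical construction from the dataframes above would look roughly like this (the rescaling, column names, and batch size are assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(rescale=1 / 255.0)

train_datagen = gen.flow_from_dataframe(train_df, x_col="filepath", y_col="label",
                                        target_size=(height, width),
                                        class_mode="binary", batch_size=32)
val_datagen = gen.flow_from_dataframe(val_df, x_col="filepath", y_col="label",
                                      target_size=(height, width),
                                      class_mode="binary", batch_size=32, shuffle=False)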

πŸ“Œ VGG19

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten
from tensorflow.keras.applications import vgg19

vgg = vgg19.VGG19(
    include_top = False,
    weights = 'imagenet',
    input_shape = (height, width, 3)
)

model = Sequential()
model.add(vgg)
model.add(Flatten())
model.add(Dense(1, activation = 'sigmoid'))

model.compile(
    optimizer = 'adam',
    loss = 'binary_crossentropy',
    metrics = ['accuracy']
)

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor = 'val_loss', patience = 10)

history = model.fit(
    train_dataset,
    epochs = 100,
    validation_data = valid_dataset,
    callbacks = [early_stop]
)

πŸ“Œ DenseNet121

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten
from tensorflow.keras.applications import densenet

densenet = densenet.DenseNet121(
    include_top = False,
    weights = 'imagenet',
    input_shape = (height, width, 3),
    pooling = 'avg'
)

modeld = Sequential()
modeld.add(densenet)
modeld.add(Flatten())
modeld.add(Dense(1, activation = 'sigmoid'))

modeld.compile(
    optimizer = 'adam',
    loss = 'binary_crossentropy',
    metrics = ['accuracy']
)

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor = 'val_loss', patience = 10)

history = modeld.fit(
    train_dataset,
    epochs = 100,
    validation_data = valid_dataset,
    callbacks = [early_stop]
)
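
Since all three backbones end in a single sigmoid unit, evaluation and prediction follow the same pattern; a short sketch, assuming a test_dataset iterator built like the ones above:

loss, acc = modeld.evaluate(test_dataset)
print(f"test accuracy: {acc:.4f}")

probs = modeld.predict(test_dataset)               # sigmoid probabilities in [0, 1]
pred_labels = (probs.ravel() > 0.5).astype(int)    # class index depends on the generator's ordering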

πŸ€ Submission & Score

πŸ“Œ Resnet152V -> Best Score

πŸ“Œ νŒ€μ›λ“€μ˜ 이미지λ₯Ό μ΄μš©ν•œ 마슀크 착용/미착용 예츑


3. πŸ“ Sentences type Classification πŸ“


πŸ† dacon AI κ²½μ§„λŒ€νšŒ : λ¬Έμž₯ μœ ν˜• λΆ„λ₯˜ν•˜κΈ°
μ£Όμ†Œ : https://dacon.io/competitions/official/236037/overview/description

πŸ“œ notion : https://www.notion.so/Dacon-AI-54fe914d79584be2a7bd9404ee6fa9a5

πŸ“ƒ summary

ν•˜λ£¨ 쒅일 ν•œ λ§ˆλ””λ„ ν•˜μ§€ μ•ŠλŠ” 날은 상상할 수 μžˆμ§€λ§Œ, ν•œ κΈ€μžλ„ 읽지 μ•ŠλŠ” 날은 상상할 수 μ—†λŠ” κ²ƒμ²˜λŸΌ μš°λ¦¬λŠ” μˆ˜λ§Žμ€ λ¬Έμž₯κ³Ό κΈ€ 속에 λ‘˜λŸ¬μ‹Έμ—¬ μ‚΄κ³  μžˆλ‹€. νŠΉνžˆλ‚˜ μ½”λ‘œλ‚˜19 이후, 우리 μ‚¬νšŒμ—μ„  온라인과 λΉ„λŒ€λ©΄ μ†Œν†΅μ΄ 주된 ꡐλ₯˜ 방식이 λ˜μ—ˆλ‹€. 그만큼 μš°λ¦¬λŠ” 이전보닀도 더 λ§Žμ€ λ¬Έμž₯을 읽고 μ“°λ©° 세상과 μ†Œν†΅ν•˜κ³  μžˆλ‹€. 이처럼 μˆ˜λ§Žμ€ 글듀을 AI λͺ¨λΈμ„ ν™œμš©ν•΄ ν•™μŠ΅ν•˜κ³ , λΉ λ₯΄κ²Œ λΆ„λ₯˜ν•  수 μžˆλ‹€λ©΄ μš°λ¦¬λŠ” 더 μ •κ΅ν•˜κ²Œ λΆ„λ₯˜λœ 정보λ₯Ό μ–»κ³ , 이λ₯Ό 톡해 μ–Έμ–΄κ°€ μ“°μ΄λŠ” λͺ¨λ“  μ˜μ—­μ—μ„œ 보닀 μ‚¬μš©μž μΉœν™”μ μΈ μ„œλΉ„μŠ€λ₯Ό κ²½ν—˜ν•  수 있게 될 것이닀.

λ”°λΌμ„œ, ν•œ 발 더 λ‚˜μ•„κ°€ ν•œκ΅­μ–΄ 인곡지λŠ₯ 기술 κ³ λ„ν™”μ˜ λ°œνŒμ„ λ§ˆλ ¨ν•  수 μžˆλ„λ‘ 창의적인 λ¬Έμž₯ μœ ν˜• λΆ„λ₯˜ AI λͺ¨λΈμ„ λ§Œλ“œλŠ” 것이 λͺ©ν‘œμ΄λ‹€.

πŸ—‚ Data info.

Dacon sentence type classification data : https://dacon.io/competitions/official/236037/data

πŸ“ train dataset

λ¬Έμž₯에 λ”°λ₯Έ μœ ν˜•, κ·Ήμ„±, μ‹œμ œ, 확싀성이 κ΅¬λΆ„λ˜μ–΄ 있음 (총 72개 μ’…λ₯˜μ˜ Class 쑴재)

πŸ“ train dataset

총 7090 λ¬Έμž₯으둜 κ΅¬μ„±λ˜μ–΄ 있음

πŸ“‚ train/test shape
train (16541, 7)
test (7090, 2)

πŸ“Š Visualization

1️⃣ Frequency of each label

Polarity, type, and certainty each have labels heavily skewed toward one value.
A way to handle this imbalance will need to be considered.

2️⃣ ν•˜λ‚˜λ‘œ ν†΅ν•©ν•œ label의 λΉˆλ„μˆ˜ μ‹œκ°ν™”

각 클래슀 μ‚¬μ΄μ—μ„œ λΆˆκ· ν˜•μ΄ 일어남을 μ•Œ 수 μžˆλ‹€.

πŸ” Modeling

πŸ“Œ Tensorflow

import numpy as np
import pandas as pd
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=5)
embedding_dim = 256

def Model(label):
    n_class = train_labels[label].shape[1]
    
    # λͺ¨λΈ μ •μ˜
    model = Sequential()
    model.add(Embedding(input_dim = 77426,
                        output_dim = embedding_dim,
                        input_length = 80))
   
    model.add(Bidirectional(LSTM(64, return_sequences=True)))
    model.add(Bidirectional(LSTM(64, return_sequences=True)))
    model.add(LSTM(64))
    
    model.add(Dense(64, activation = 'relu'))
    model.add(Dense(n_class, activation = 'softmax'))

    model.compile(optimizer = 'adam',
                  loss = 'categorical_crossentropy',
                  metrics = ['accuracy'])
    
    # λͺ¨λΈ μ„œλ¨Έλ¦¬
    display(model.summary())

    # λͺ¨λΈ ν”ΌνŒ…
    history = model.fit(train_vec, train_labels[label], epochs = 100, validation_data = (val_vec, val_labels[label]), callbacks = [early_stop])
    df_hist = pd.DataFrame(history.history)

    print('*'*5, 'training finished', '*'*5)
    df_hist[['loss','val_loss']].plot()
    df_hist[['accuracy','val_accuracy']].plot()
    
    # λͺ¨λΈ 평가
    print('valid 평가 κ²°κ³Ό')
    model.evaluate(val_vec, val_labels[label])

    # 예츑
    y_pred = model.predict(test_vec)
    y_predict = np.argmax(y_pred, axis = 1)
    return y_predict
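
train_vec, val_vec, test_vec, and the one-hot train_labels[label] matrices are prepared before Model(label) is called. One plausible preparation step, assuming a Keras Tokenizer whose vocabulary ends up around the 77,426 used in the Embedding layer, a maximum length of 80, and hypothetical column names for the sentences and the four labels:

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(train["문장"])                     # assumed sentence column name
train_vec = pad_sequences(tokenizer.texts_to_sequences(train["문장"]), maxlen=80)
test_vec = pad_sequences(tokenizer.texts_to_sequences(test["문장"]), maxlen=80)

# one-hot targets per label column; a train/validation split built the same way gives val_vec / val_labels
train_labels = {col: pd.get_dummies(train[col]).values
                for col in ["유형", "극성", "시제", "확실성"]}  # assumed label column names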

πŸ“Œ pytorch

import torch
import torch.nn as nn

class BaseModel(nn.Module):
    def __init__(self, input_dim=9351):
        super(BaseModel, self).__init__()
        self.feature_extract = nn.Sequential(
            nn.Linear(in_features=input_dim, out_features=1024),
            nn.BatchNorm1d(1024),
            nn.LeakyReLU(),
            nn.Linear(in_features=1024, out_features=1024),
            nn.BatchNorm1d(1024),
            nn.LeakyReLU(),
            nn.Linear(in_features=1024, out_features=512),
            nn.BatchNorm1d(512),
            nn.LeakyReLU(),
        )
        self.type_classifier = nn.Sequential(
            nn.Dropout(p=0.3),
            nn.Linear(in_features=512, out_features=4),
        )
        self.polarity_classifier = nn.Sequential(
            nn.Dropout(p=0.3),
            nn.Linear(in_features=512, out_features=3),
        )
        self.tense_classifier = nn.Sequential(
            nn.Dropout(p=0.3),
            nn.Linear(in_features=512, out_features=3),
        )
        self.certainty_classifier = nn.Sequential(
            nn.Dropout(p=0.3),
            nn.Linear(in_features=512, out_features=2),
        )
            
    def forward(self, x):
        x = self.feature_extract(x)
        # λ¬Έμž₯ μœ ν˜•, κ·Ήμ„±, μ‹œμ œ, 확싀성을 각각 λΆ„λ₯˜
        type_output = self.type_classifier(x)
        polarity_output = self.polarity_classifier(x)
        tense_output = self.tense_classifier(x)
        certainty_output = self.certainty_classifier(x)
        return type_output, polarity_output, tense_output, certainty_output
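
Because the model returns four logit tensors, a training step combines four cross-entropy terms; a minimal sketch of one batch update under that assumption (learning rate and equal loss weights are illustrative):

model = BaseModel()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)    # illustrative learning rate

def train_step(x, type_y, polarity_y, tense_y, certainty_y):
    model.train()
    optimizer.zero_grad()
    type_out, polarity_out, tense_out, certainty_out = model(x)
    # sum the four classification losses with equal weights
    loss = (criterion(type_out, type_y) + criterion(polarity_out, polarity_y)
            + criterion(tense_out, tense_y) + criterion(certainty_out, certainty_y))
    loss.backward()
    optimizer.step()
    return loss.item()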

πŸ€ Submission & Score

πŸ“Œ Tensorflow : 0.5794
πŸ“Œ pytorch(baseline) : 0.5362
πŸ“Œ best score : 0.7559 (λŒ€νšŒκ°€ λλ‚˜μ§€ μ•Šμ•„ λͺ¨λΈμ€ 미곡개)