Sums of data types should be preferred over "bucket" data types with a `thingType` field
What are the tradeoffs between these two approaches? Is one better overall?
`User` data type contains a `userType :: UserType` field with a sum type to differentiate
types
import Data.Text (Text)

data UserType = FreeUser | PaidUser

data User =
  User { userId           :: Int
       , userName         :: String
       , userType         :: UserType
       , userRewardPoints :: Maybe Int
       }

data ArticleType = FreeArticle | PaidArticle

data Article =
  Article
    { articleId    :: Int
    , articleName  :: Text
    , articleTitle :: Text
    , articleType  :: ArticleType
    }
case
serveArticle :: ArticleID -> Servant.Handler
serveArticle aId = do
  -- ... snip ...
  a <- runDB $ getArticle aId
  case userType u of
    PaidUser ->
      case articleType a of
        PaidArticle -> do
          runDB $ upsert u { userRewardPoints = fmap (+ 5) (userRewardPoints u) }
          displayArticle a
        FreeArticle -> do
          runDB $ upsert u { userRewardPoints = fmap (+ 2) (userRewardPoints u) }
          displayArticle a
    FreeUser ->
      case articleType a of
        PaidArticle -> Http400 "denied"
        FreeArticle -> displayArticle a
`User` is a sum type containing entirely different Users and their sometimes duplicated contents
types
data FreeUser =
  FreeUser { freeUserId   :: Int
           , freeUserName :: String
           }

data PaidUser =
  PaidUser { paidUserId           :: Int
           , paidUserName         :: String
           , paidUserRewardPoints :: Int
           }

data User = Free FreeUser
          | Paid PaidUser

data FreeArticle =
  FreeArticle { freeArticleId   :: Int
              , freeArticleName :: String
              }

data PaidArticle =
  PaidArticle { paidArticleId   :: Int
              , paidArticleName :: String
              }

-- Constructors renamed from Free / Paid to FreeA / PaidA: User already uses
-- those names, and two constructors in one module cannot share a name.
data Article = FreeA FreeArticle | PaidA PaidArticle
case
serveArticle :: ArticleID -> Servant.Handler
serveArticle aId = do
  -- ... snip ...
  a <- runDB $ getArticle aId
  case u of
    Paid pu ->
      case a of
        PaidA pa -> do
          runDB $ upsert (Paid pu { paidUserRewardPoints = paidUserRewardPoints pu + 5 })
          displayArticle pa
        FreeA fa -> do
          runDB $ upsert (Paid pu { paidUserRewardPoints = paidUserRewardPoints pu + 2 })
          displayArticle fa
    Free _ ->
      case a of
        PaidA _  -> Http400 "denied"
        FreeA fa -> displayArticle fa
Comparing the calling code
We case differently on both user and article with these approaches. Does either seem advantageous?
serveArticle :: ArticleID -> Servant.Handler
serveArticle aId = do
  -- ... snip ...
  a <- runDB $ getArticle aId
  case userType u of
    PaidUser ->
      case articleType a of
        PaidArticle -> do
          runDB $ upsert u { userRewardPoints = fmap (+ 5) (userRewardPoints u) }
          displayArticle a
        FreeArticle -> do
          runDB $ upsert u { userRewardPoints = fmap (+ 2) (userRewardPoints u) }
          displayArticle a
    FreeUser ->
      case articleType a of
        PaidArticle -> Http400 "denied"
        FreeArticle -> displayArticle a
-- Sum-of-types version. Article's constructors are written FreeA / PaidA here
-- so that they do not clash with User's Free / Paid.
serveArticle :: ArticleID -> Servant.Handler
serveArticle aId = do
  -- ... snip ...
  a <- runDB $ getArticle aId
  case u of
    Paid pu ->
      case a of
        PaidA pa -> do
          runDB $ upsert (Paid pu { paidUserRewardPoints = paidUserRewardPoints pu + 5 })
          displayArticle pa
        FreeA fa -> do
          runDB $ upsert (Paid pu { paidUserRewardPoints = paidUserRewardPoints pu + 2 })
          displayArticle fa
    Free _ ->
      case a of
        PaidA _  -> Http400 "denied"
        FreeA fa -> displayArticle fa
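One way to compare the calling code without the database and handler plumbing is to pull the decision into a pure function. The sketch below does that for the sum-of-types encoding. `ArticleKind`, `Outcome`, `decide`, and `addPoints` are names made up for this illustration and do not appear in the handlers above; the point values (5 and 2) are the ones the handlers use.

```haskell
-- Sum-of-types encoding of User, as in the second approach.
data FreeUser = FreeUser
  { freeUserId   :: Int
  , freeUserName :: String
  } deriving (Show, Eq)

data PaidUser = PaidUser
  { paidUserId           :: Int
  , paidUserName         :: String
  , paidUserRewardPoints :: Int
  } deriving (Show, Eq)

data User = Free FreeUser | Paid PaidUser
  deriving (Show, Eq)

-- Hypothetical stand-in for the article: only paid vs. free matters here.
data ArticleKind = FreeKind | PaidKind
  deriving (Show, Eq)

-- Hypothetical result: refuse, or serve and return the (possibly updated) user.
data Outcome = Denied | Served User
  deriving (Show, Eq)

-- All four user/article combinations in one place.
decide :: User -> ArticleKind -> Outcome
decide (Paid pu)  PaidKind = Served (Paid (addPoints 5 pu))
decide (Paid pu)  FreeKind = Served (Paid (addPoints 2 pu))
decide (Free _)   PaidKind = Denied
decide u@(Free _) FreeKind = Served u

-- Only a PaidUser can receive points, so no Maybe is involved.
addPoints :: Int -> PaidUser -> PaidUser
addPoints n pu = pu { paidUserRewardPoints = paidUserRewardPoints pu + n }
```

Because `addPoints` takes a `PaidUser`, there is no way to call it on a free user by accident. In the `userType` version the same helper would have to accept any `User` and decide what to do with a `Nothing`.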
Edit: A suggestion from a user on the Haskell Discord that is both correct by construction and ergonomic
types
data UserType = FreeUser | PaidUser PaidUserData

data User =
  User { userId   :: Int
       , userName :: String
       , userType :: UserType
       }

data PaidUserData = PaidUserData
  { userRewardPoints :: Int
  }

data FreeArticle =
  FreeArticle { freeArticleId   :: Int
              , freeArticleName :: String
              }

data PaidArticle =
  PaidArticle { paidArticleId   :: Int
              , paidArticleName :: String
              }

data Article = Free FreeArticle | Paid PaidArticle
This is correct by construction because the variation moves into the sum type: a free user has no field for reward points at all.
User { userId   = 0
     , userName = "h4x0r"
     , userType = FreeUser
     }
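A small sketch of what working with this shape could look like. The types are the ones from this suggestion; `addReward` is a hypothetical helper added for illustration, not part of the original suggestion.

```haskell
data PaidUserData = PaidUserData
  { userRewardPoints :: Int
  } deriving (Show, Eq)

data UserType = FreeUser | PaidUser PaidUserData
  deriving (Show, Eq)

data User = User
  { userId   :: Int
  , userName :: String
  , userType :: UserType
  } deriving (Show, Eq)

-- Hypothetical helper: add n points to a paid user, leave a free user unchanged.
-- The reward points are only reachable by matching on PaidUser.
addReward :: Int -> User -> User
addReward n u =
  case userType u of
    FreeUser     -> u
    PaidUser pud ->
      u { userType = PaidUser (pud { userRewardPoints = userRewardPoints pud + n }) }
```

Shared fields such as `userName` stay directly accessible on every `User` without a `case`, which is where the ergonomics come from. Only the part that varies requires a pattern match.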
TODO case
serveArticle :: ArticleID -> Servant.Handler
serveArticle aId = do
  -- ... snip ...
  a <- runDB $ getArticle aId
  case userType u of
    PaidUser pud ->
      case a of
        Paid pa -> do
          runDB $ upsert u { userType = PaidUser (pud { userRewardPoints = userRewardPoints pud + 5 }) }
          displayArticle pa
        Free fa -> do
          runDB $ upsert u { userType = PaidUser (pud { userRewardPoints = userRewardPoints pud + 2 }) }
          displayArticle fa
    FreeUser ->
      case a of
        Paid _  -> Http400 "denied"
        Free fa -> displayArticle fa
my commentary
`User` is not correct by construction with the `userType` example
User { userId           = 0
     , userName         = "h4x0r"
     , userType         = FreeUser
     , userRewardPoints = Just 10000 -- free users shouldn't have reward points
     }
`User` is correct by construction with the sum containing different Users
Free (FreeUser
  { freeUserId   = 0
  , freeUserName = "valid free"
  })
Paid (PaidUser
  { paidUserId           = 1
  , paidUserName         = "valid paid"
  , paidUserRewardPoints = 10000
  })
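As a rough illustration of what this buys downstream code, the sketch below sums reward points over a list of users. `totalRewardPoints` is a made-up function for this example; the types are the sum-of-types `User` from above.

```haskell
data FreeUser = FreeUser
  { freeUserId   :: Int
  , freeUserName :: String
  } deriving (Show, Eq)

data PaidUser = PaidUser
  { paidUserId           :: Int
  , paidUserName         :: String
  , paidUserRewardPoints :: Int
  } deriving (Show, Eq)

data User = Free FreeUser | Paid PaidUser
  deriving (Show, Eq)

-- Hypothetical helper: total points across all paid users.
-- The pattern `Paid pu` skips free users; there is no Maybe to unwrap
-- and no "free user with points" case to worry about.
totalRewardPoints :: [User] -> Int
totalRewardPoints us = sum [paidUserRewardPoints pu | Paid pu <- us]
```

With the `userType` encoding, the equivalent function would have to decide whether a `FreeUser` carrying `Just 10000` counts toward the total.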
TODO conclusion
The only really big advantage of a sum of different types is being correct by construction.
Another advantage stemming from that is that it prevents a proliferation of optional fields from muddying the purpose of individual types.
- Optional fields typically force calling code to laboriously validate the type over and over, which obscures the intent of the surrounding code and spreads infectiously.
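To make the validation burden concrete, here is a sketch of the kind of check the `userType` encoding pushes onto callers. `validateUser` is a hypothetical function. The rule that a paid user must have `Just` some points is an assumption made for this example; the notes above only state that free users shouldn't have reward points.

```haskell
data UserType = FreeUser | PaidUser
  deriving (Show, Eq)

data User = User
  { userId           :: Int
  , userName         :: String
  , userType         :: UserType
  , userRewardPoints :: Maybe Int
  } deriving (Show, Eq)

-- Hypothetical validator: every caller that cares about reward points
-- has to run (or trust that someone ran) a check like this first.
validateUser :: User -> Either String User
validateUser u =
  case (userType u, userRewardPoints u) of
    (FreeUser, Just _)  -> Left "free users must not have reward points"
    -- Assumed rule, not stated in the notes above.
    (PaidUser, Nothing) -> Left "paid users must have reward points"
    _                   -> Right u
```

Nothing in the type records that a `User` has already been validated, so the check tends to be repeated wherever the optional field is read. With the sum encodings the invalid combinations cannot be constructed, so there is nothing to validate.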