1 Generate Data Frame

Suppose you are a data scientist and you are given a project at a start-up company, for instance Kopi Kenangan. Let’s say you are asked to generate a data set that represents their daily sales. If I ask you what kind of data set you can generate, what would it contain? Here, I assume you want to provide them with the following variables:

  • Id: identifiers for the 5000 transactions.
  • Date: daily dates for the 5000 transactions, starting from 2018/01/01.
  • Name: create 20 random cashier names (you can use the names of your classmates, including yourself) to cover all 5000 transactions at Kopi Kenangan.
  • City: allocate these 5000 transactions to the biggest cities in Indonesia (in equal proportions). Here I assume:
    • Jakarta
    • Bogor
    • Depok
    • Tangerang
    • Bekasi
  • Outlet: allocate these 5000 transactions among five outlets. Here I assume:
    • Outlet 1
    • Outlet 2
    • Outlet 3
    • Outlet 4
    • Outlet 5
  • Menu: randomly generate the menu item sold in each of the 5000 daily transactions at Kopi Kenangan. Here I assume:
    • Cappuccino
    • Es Kopi Susu
    • Hot Caramel Latte
    • Hot Chocolate
    • Hot Red Velvet Latte
    • Ice Americano
    • Ice Berry Coffee
    • Ice Cafe Latte
    • Ice Caramel Latte
    • Ice Coffee Avocado
    • Ice Coffee Lite
    • Ice Matcha Espresso
    • Ice Matcha Latte
    • Ice Red Velvet Latte
  • Price: generate random prices (min = 18000, max = 45000).
  • Discount: generate random discounts (min = 0.05, max = 0.12).
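
The specification above can be sketched in base R as follows. This is only one possible reading, not the reference solution: it assumes one transaction per consecutive day, uses a fixed seed so the draws are reproducible, names the data frame `sales`, and fills the cashier list with placeholder names (only "Bakti Siregar" appears in this lab; the others are hypothetical). Menu items use the standard spellings "Cappuccino" and "Ice Berry Coffee".

```r
set.seed(123)  # assumption: fixed seed so the random draws are reproducible
n <- 5000      # number of transactions

cities  <- c("Jakarta", "Bogor", "Depok", "Tangerang", "Bekasi")
outlets <- paste("Outlet", 1:5)
menus   <- c("Cappuccino", "Es Kopi Susu", "Hot Caramel Latte", "Hot Chocolate",
             "Hot Red Velvet Latte", "Ice Americano", "Ice Berry Coffee",
             "Ice Cafe Latte", "Ice Caramel Latte", "Ice Coffee Avocado",
             "Ice Coffee Lite", "Ice Matcha Espresso", "Ice Matcha Latte",
             "Ice Red Velvet Latte")

# Hypothetical cashier names: replace the placeholders with your classmates' names
cashiers <- c("Bakti Siregar", paste("Cashier", 2:20))

sales <- data.frame(
  Id       = 1:n,
  # Assumption: one transaction per day, starting on 2018-01-01
  Date     = seq(as.Date("2018-01-01"), by = "day", length.out = n),
  Name     = sample(cashiers, n, replace = TRUE),
  # Repeat the five cities to length n, then shuffle: exactly equal proportions
  City     = sample(rep(cities, length.out = n)),
  Outlet   = sample(outlets, n, replace = TRUE),
  Menu     = sample(menus, n, replace = TRUE),
  Price    = round(runif(n, min = 18000, max = 45000)),
  Discount = round(runif(n, min = 0.05, max = 0.12), 2),
  stringsAsFactors = FALSE
)

str(sales)   # inspect column types
head(sales)  # first six transactions
```

Shuffling a repeated vector for `City` guarantees 1000 transactions per city, whereas `sample(cities, n, replace = TRUE)` would only give roughly equal shares.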

2 Extraction

In this section, you are expected to be able to apply a very basic data frame manipulation called extraction. Please cover the following tasks:

  • Extract all transactions at Kopi Kenangan in a specific city, for instance Jakarta.

  • Extract all transactions at Kopi Kenangan for a specific menu item, for instance Hot Chocolate.

  • Extract all transactions at Kopi Kenangan handled by a specific cashier, for instance Bakti Siregar.

  • Extract all transactions at Kopi Kenangan in a specific price range, for instance >=40000.

  • Add a new variable called Total_Price to your data frame (the data frame you created above).

  • Add a new variable called Category_Price to your data frame (the data frame you created above). Here, I assume the categories “expensive”, “so-so”, and “cheap”.
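
The extraction tasks above might look like the sketch below. To keep the chunk runnable on its own it rebuilds a small 200-row stand-in named `sales` (an assumption; in your own work you would use the full data frame from Section 1). Two further assumptions are mine, not the lab's: `Total_Price` is taken to be the price after the discount, and the cut-offs 27000 and 36000 for `Category_Price` simply split the 18000 to 45000 range into thirds.

```r
# Small stand-in for the Section 1 data frame (assumption: named `sales`)
set.seed(1)
n <- 200
sales <- data.frame(
  Id       = 1:n,
  Date     = seq(as.Date("2018-01-01"), by = "day", length.out = n),
  Name     = sample(c("Bakti Siregar", "Cashier 2", "Cashier 3"), n, replace = TRUE),
  City     = sample(c("Jakarta", "Bogor", "Depok", "Tangerang", "Bekasi"),
                    n, replace = TRUE),
  Menu     = sample(c("Hot Chocolate", "Es Kopi Susu", "Ice Americano"),
                    n, replace = TRUE),
  Price    = round(runif(n, 18000, 45000)),
  Discount = round(runif(n, 0.05, 0.12), 2),
  stringsAsFactors = FALSE
)

# Row extraction: keep the rows where a logical condition is TRUE
jakarta       <- sales[sales$City == "Jakarta", ]
hot_chocolate <- subset(sales, Menu == "Hot Chocolate")  # same idea via subset()
bakti         <- sales[sales$Name == "Bakti Siregar", ]
pricey        <- sales[sales$Price >= 40000, ]

# Assumption: Total_Price is the price after the discount is applied
sales$Total_Price <- sales$Price * (1 - sales$Discount)

# Assumption: illustrative cut-offs; intervals are closed on the left,
# so a price of exactly 36000 counts as "expensive"
sales$Category_Price <- cut(
  sales$Price,
  breaks = c(-Inf, 27000, 36000, Inf),
  labels = c("cheap", "so-so", "expensive"),
  right  = FALSE
)

head(sales)
table(sales$Category_Price)
```

Bracket indexing and `subset()` return the same rows here; `cut()` is used instead of nested `ifelse()` calls because it keeps the category boundaries in one place.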

3 Rename Data Frame

Please rename all variables of your data frame (the data frame you created above) into your own language.
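
As a minimal sketch of the renaming step, the chunk below maps each English column name to a new one through a named vector. The three-row data frame is a stand-in so the chunk runs on its own, and the Indonesian names are only an example choice on my part; use whatever names fit your own language.

```r
# Tiny stand-in with the same columns as the data frame built earlier
sales <- data.frame(
  Id             = 1:3,
  Date           = as.Date("2018-01-01") + 0:2,
  Name           = c("Bakti Siregar", "Cashier 2", "Cashier 3"),
  City           = c("Jakarta", "Bogor", "Depok"),
  Outlet         = paste("Outlet", 1:3),
  Menu           = c("Hot Chocolate", "Es Kopi Susu", "Ice Americano"),
  Price          = c(25000, 41000, 18000),
  Discount       = c(0.05, 0.10, 0.12),
  Total_Price    = c(23750, 36900, 15840),
  Category_Price = c("cheap", "expensive", "cheap"),
  stringsAsFactors = FALSE
)

# Lookup table: old name = "new name" (example translation, adjust freely)
new_names <- c(
  Id = "Id", Date = "Tanggal", Name = "Nama", City = "Kota",
  Outlet = "Gerai", Menu = "Menu", Price = "Harga", Discount = "Diskon",
  Total_Price = "Total_Harga", Category_Price = "Kategori_Harga"
)

# Look up every current column name in the table, whatever the column order
names(sales) <- unname(new_names[names(sales)])
names(sales)
```

Indexing the lookup vector by the current names avoids relying on column position, which a plain `names(sales) <- c(...)` assignment would do.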

4 Case Study

Using your data frame, please complete the following tasks:

  • Find the sales frequency of each menu item and identify which items are best-selling at Kopi Kenangan!

  • Find out which city had the most sales at Kopi Kenangan!

  • Find out which city had the most discounted sales at Kopi Kenangan!

  • Which year had the most sales at Kopi Kenangan?
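
One way to approach these questions with base R is sketched below. The questions leave some room for interpretation, so the sketch makes its readings explicit: "most sales" is taken as the largest summed `Total_Price`, and "most discounted" as the largest summed discount amount (`Price * Discount`). The 1000-row `sales` data frame is a stand-in so the chunk runs on its own.

```r
# Stand-in data (assumption: same column names as the Section 1 data frame)
set.seed(2)
n <- 1000
sales <- data.frame(
  Date     = seq(as.Date("2018-01-01"), by = "day", length.out = n),
  City     = sample(c("Jakarta", "Bogor", "Depok", "Tangerang", "Bekasi"),
                    n, replace = TRUE),
  Menu     = sample(c("Hot Chocolate", "Es Kopi Susu", "Ice Americano"),
                    n, replace = TRUE),
  Price    = round(runif(n, 18000, 45000)),
  Discount = round(runif(n, 0.05, 0.12), 2),
  stringsAsFactors = FALSE
)
sales$Total_Price <- sales$Price * (1 - sales$Discount)

# 1. Frequency of each menu item, most frequent first
menu_freq <- sort(table(sales$Menu), decreasing = TRUE)
best_menu <- names(menu_freq)[1]

# 2. City with the most sales (assumption: largest summed Total_Price)
city_sales <- tapply(sales$Total_Price, sales$City, sum)
top_city   <- names(which.max(city_sales))

# 3. City with the most discounted sales (assumption: largest summed discount amount)
city_discount <- tapply(sales$Price * sales$Discount, sales$City, sum)
top_disc_city <- names(which.max(city_discount))

# 4. Year with the most sales: extract the year from Date, then aggregate
sales$Year <- format(sales$Date, "%Y")
year_sales <- tapply(sales$Total_Price, sales$Year, sum)
top_year   <- names(which.max(year_sales))

menu_freq
c(best_menu = best_menu, top_city = top_city,
  top_disc_city = top_disc_city, top_year = top_year)
```

If "most sales" is meant as a count of transactions instead, replacing the `tapply(..., sum)` calls with `table(sales$City)` or `table(sales$Year)` gives that reading.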
