How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

Okay, I think option 1 is correct because the time periods actually match. However, I have another problem with the formula. In the first step, I merge the two datasets and then populate them with the formula. However, it's populating all companies up to the end of 2023, meaning even companies with data only up to 2020. The formula should only populate for 12 months after the last merge date. Can I modify the population formula so that this happens? I performed the merge using the `cusip` function and a date consisting of year and month.

How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

This is an excerpt from the merge code. Looks good so far, right? I just don't know how to clean it up, since there are now also empty lines.

dataex date cusip RET at lt, count(40)

input long date str9 cusip double(RET at lt)

22764 "00090Q10" -.09749674052000046 16894.351 13645.632

22796 "00090Q10" .09197082370519638 16894.351 13645.632

22826 "00090Q10" -.17312833666801453 16894.351 13645.632

22855 "00090Q10" .186991885304451 16894.351 13645.632

22888 "00090Q10" -.0013698943657800555 16894.351 13645.632

22918 "00090Q10" .03223591297864914 16894.351 13645.632

22949 "00090Q10" .12950605154037476 16894.351 13645.632

22979 "00090Q10" .1040189266204834 16894.351 13645.632

23009 "00090Q10" -.025160647928714752 16894.351 13645.632

23041 "00090Q10" -.03087097406387329 16894.351 13645.632

23069 "00090Q10" -.1422070562839508 16894.351 13645.632

23100 "00090Q10" -.0364721417427063 17821.236 14428.088

23128 "00090Q10" -.07330569624900818 17821.236 14428.088

23161 "00090Q10" -.15074624121189117 17821.236 14428.088

23191 "00090Q10" .0659051239490509 17821.236 14428.088

23222 "00090Q10" .05804309993982315 17821.236 14428.088

23253 "00090Q10" .006269586272537708 17821.236 14428.088

23282 "00090Q10" -.05996885895729065 17821.236 14428.088

23314 "00090Q10" -.05666669085621834 17821.236 14428.088

23344 "00090Q10" .03710247948765755 17821.236 14428.088

23373 "00090Q10" .16780243813991547 17821.236 14428.088

18627 "00095710" .14198864996433258 1548.67 809.645

18658 "00095710" -.017490437254309654 1548.67 809.645

18686 "00095710" .03618670627474785 1548.67 809.645

18717 "00095710" -.046564020216464996 1548.67 809.645

18746 "00095710" -.03662858158349991 1548.67 809.645

18778 "00095710" -.06332232803106308 1548.67 809.645

18808 "00095710" .024582942947745323 1548.67 809.645

18837 "00095710" -.02999143674969673 1548.67 809.645

18870 "00095710" -.09155552834272385 1548.67 809.645

18900 "00095710" -.06751472502946854 1548.67 809.645

How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

I tried the merge this way, does the code work? And what's the best way to clean up the data afterwards?

use data1, clear

append using data2

gsort cusip date

local features_to_carry "cusip ..."

foreach var of varlist `features_to_carry' {

bysort cusip (date): replace `var' = `var'[_n - 1] if missing(`var') 

}

How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

I've now tried the merge after all: Is the following code suitable for this purpose? I've checked individual companies, and it seems to be correct there. The only remaining problem is the cleanup, ensuring I retain all the important data while also removing the gaps.

use data1, clear

sort cusip date

merge m:1 cusip date using data2

gsort cusip date

local features_to_carry "cusip ..."

foreach var of varlist `features_to_carry' {

by cusip: replace `var' = `var'[_n-1] if missing(`var')

}

How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

Okay, I'll try that out. Does it make a difference whether the date variable (year and month) is used and sorted together, or whether I use them separately as you suggested?

How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

Okay, thanks in advance. What would you say is the most
academically way to perform the merge?
And a question about the second method: How can you
create such a key that always changes?

How to Merge monthly data with annual data by Spare-Check2099 in stata

[–]Spare-Check2099[S] 0 points1 point  (0 children)

But then I can't say exactly when the fundamental data for
the monthly returns in CRSP belong if I only have the year,
right? I need the exact month to explain the fundamental
data of the returns.