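When you need to run the same procedure over and over,
the simplest approach is to copy-paste, copy-paste, copy-paste the same block of code.
ex.
Suppose you have four yearly datasets for 2005-2008 (a2005 - a2008) and want the odds ratio (OR) and related statistics for y1 against x1 and y2 against x2 in each year.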
*The intuitive lazy way;
proc logistic data=a2005;
model y1(event="1") = x1;
run;
proc logistic data=a2005;
model y2(event="1") = x2;
run;
proc logistic data=a2006;
model y1(event="1") = x1;
run;
proc logistic data=a2006;
model y2(event="1") = x2;
run;
...
...
...
proc logistic data=a2008;
model y2(event="1") = x2;
run;
But we all know this is clumsy and inefficient, and it does not fit the idea of elegant coding,
so macros are worth learning.
What does the same job look like once we bring in a macro?
OPTIONS NOSYMBOLGEN MPRINT;
%MACRO logiMc(year, y_vb, x_vb);
proc logistic data=a&year;
model &y_vb(event="1") = &x_vb;
run;
%MEND logiMc;
%logiMc(2005,y1,x1)
%logiMc(2005,y2,x2)
%logiMc(2006,y1,x1)
%logiMc(2006,y2,x2)
%logiMc(2007,y1,x1)
%logiMc(2007,y2,x2)
%logiMc(2008,y1,x1)
%logiMc(2008,y2,x2)
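The logistic code inside the macro (the code between %MACRO and %MEND) is only 3 lines and is written once, whereas the original approach needs about 24 lines. Even counting the eight one-line calls, the program is clearly shorter, and a change to the model only has to be made in one place.
Whether to learn it? That is up to you : )
Macros are actually not hard: there are just the extra % and & symbols, plus the idea of passing parameters.
Explanation below:
A name that starts with & is called a "macro variable";
statements that are not part of the repeated code but control it start with %; let's call them "macro statements".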
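OPTIONS NOSYMBOLGEN MPRINT; /* MPRINT writes the SAS code generated by the macro to the log; NOSYMBOLGEN turns off the messages showing how each macro variable resolves (use SYMBOLGEN to see them) */
%MACRO logiMc(year, y_vb, x_vb); /* %MACRO starts the macro definition; the macro is named logiMc and takes 3 parameters: year, y_vb, x_vb */
proc logistic data=a&year; /* this line and the next two are the code to be repeated */
model &y_vb(event="1") = &x_vb; /* each macro variable (a name starting with an ampersand) is replaced by the value passed in the call, which is how the code gets reused */
run;
%MEND logiMc; /* %MEND ends the macro definition */
/*** Each call below passes one set of parameter values, which are substituted into the macro body ***/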
%logiMc(2005,y1,x1)
%logiMc(2005,y2,x2)
%logiMc(2006,y1,x1)
%logiMc(2006,y2,x2)
%logiMc(2007,y1,x1)
%logiMc(2007,y2,x2)
%logiMc(2008,y1,x1)
%logiMc(2008,y2,x2)
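To see the substitution on its own, here is a minimal sketch using %LET and %PUT. The values assigned are hypothetical and only mirror the call for 2005 with y1 and x1; nothing here touches real data.

```sas
/* Hypothetical values, mirroring the call with 2005, y1, x1 */
%let year = 2005;
%let y_vb = y1;

/* %PUT writes the resolved text to the log */
%put data=a&year;              /* log shows: data=a2005 */
%put model &y_vb(event="1");   /* log shows: model y1(event="1") */

/* A period ends the macro variable name when more text follows it */
%put a&year._clean;            /* log shows: a2005_clean */
```

Checking the resolved text this way before running the full macro can make it easier to spot a wrong dataset or variable name.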
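Once you understand the idea behind macros, you can freely combine them with do, while, if-then, cat, or even wrap several steps into a single macro. For example:
first use temporary datasets to filter on a condition
--> process the data with RETAIN
--> then output the file I want based on a condition
//No need to understand what the variables mean here; just see how a macro like this is written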
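OPTIONS NOSYMBOLGEN MPRINT;
%MACRO subset(o_y, o_y1, m_flag);
%IF &m_flag=1 %THEN %DO;
/* keep season o_y records with indate from Dec 1 of o_y up to, but not including, Feb 28 of o_y1 */
DATA WORK.in7; set WORK.in6;
if season=&o_y and mdy(12,1,&o_y) <= indate < mdy(2,28,&o_y1) then output;
run;
/* running count of records per id; BY processing assumes the data are sorted by id */
DATA WORK.in8; set WORK.in7;
by id;
retain flu&o_y 0;
flu&o_y = flu&o_y+1;
if first.id then flu&o_y = 1;
run;
/* keep only the last record per id, which holds the total count */
DATA in&o_y; set WORK.in8;
by id;
if last.id then output;
keep id flu&o_y;
run;
%END;
%MEND subset;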
%subset(2004,2005,1)
%subset(2005,2006,1)
%subset(2006,2007,1)
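Notice that I added the line "%IF &m_flag=1 %THEN %DO;" and a new parameter "m_flag". Its purpose is simple: it works as a switch.
With the rule "run the macro for this set of parameters only if m_flag=1 (on)", you can quickly turn individual calls on or off.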
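As a small illustration of the switch, the sketch below sets the flag to 0 on one call. The flag values are made up for the example, and it assumes the subset macro has already been compiled and WORK.in6 exists.

```sas
%subset(2004,2005,1)   /* flag on: the three DATA steps are generated and run */
%subset(2005,2006,0)   /* flag off: the %IF condition is false, so no code is generated */
%subset(2006,2007,1)   /* flag on */
```

With MPRINT on, the log for the middle call should show no generated DATA steps, which is a quick way to confirm the call was skipped.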
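//Advanced usage
When there are many variables, writing out every set of macro parameters becomes tedious.
In that case, a %DO loop gives a more compact version,
ex.
/* Remember: macro statements start with a percent sign */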
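%MACRO box( xStart, xEnd );
%local xValue; /* keep the loop counter local to this macro */
%do xValue = &xStart %to &xEnd;
proc surveylogistic data=aa;
model y = x&xValue; /* assumes the predictors are named x1, x2, ... ; adjust to your own variable names */
run;
%end;
%MEND box;
%box( 1, 16 )
Like this !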
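The same looping idea can be applied to the earlier logistic example. This is only a sketch: logiAll is a hypothetical wrapper name, and it assumes the logiMc macro shown earlier has already been compiled and that datasets a2005 to a2008 exist.

```sas
/* Hypothetical wrapper: loops over years and calls logiMc for each model */
%MACRO logiAll(yStart, yEnd);
  %local yr;                     /* loop counter, local to this macro */
  %do yr = &yStart %to &yEnd;
    %logiMc(&yr, y1, x1)         /* model y1 on x1 for this year */
    %logiMc(&yr, y2, x2)         /* model y2 on x2 for this year */
  %end;
%MEND logiAll;

%logiAll(2005, 2008)
```

This replaces the eight separate calls with one, at the cost of assuming every year uses the same pair of models.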
Additions and suggestions are welcome!
: )